From tim.peters at gmail.com Sat Aug 1 07:16:14 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 1 Aug 2015 00:16:14 -0500
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References:
Message-ID:

[Tim]
>> Speaking of which, the current tzinfo API has no way to ask "is this
>> an ambiguous time?"

[Alexander Belopolsky]
> I was hoping that we would agree on the name of the flag before
> someone asks this question. :-)

You doubtless noted that I called it "first" near the end of my
message without putting up any stink at all ;-)

> With my proposal, a naive datetime t is ambiguous in timezone tz if
>
>     tz.utcoffset(t) < tz.utcoffset(t.replace(first=False))
>
> or "is this an invalid (missing) time?"

Unless I'm missing your intent entirely, that's a fine illustration of
my "The logic is bound to be annoying enough that we'd want to
concentrate it in tzstrict".  The problem I see is that the expression
you gave can never be true.

The math is indeed trivial, but a key part works "the opposite" of how
even people who've thought a lot about it "instinctively" believe.

Two cases:

1. t.first is False.  Then the expression obviously returns false (the
LHS and RHS are applied to two datetimes all of whose components -
including .first - are the same, so both utcoffset()s return the same
value, and "<" is false because they're equal).  But if t.first
(False) is telling the truth, t _is_ the later of two ambiguous times.
So we wanted an expression that returned true in that subcase.  The
result is correct only when t.first being False is lying (i.e., when t
is not the later of two ambiguous times, but t.first is False despite
that).

2. t.first is True.

2a. And t is not an ambiguous time.  Then I expect the two
utcoffset()s return the same value, and the expression correctly
returns False.

2b. And t is an ambiguous time.  Then t is the earlier of the two
times (that's what t.first is True means in this case), and the
constructed datetime is the later.
Obviously the earlier time should compare less than the later time,
but that's not what's being compared.  The offsets _from_ UTC are
being compared, and it's the earlier time that has the _greater_
offset (that's the part 90% of people "instinctively" get backwards).
So again the expression returns False incorrectly (although would be
correct in this case if ">" were used instead - but then 90% of people
would instinctively think the logic is backwards).

So in all cases the expression computes False - unless I'm missing
your intent entirely (in which case I trust you won't be shy about
enlightening me :-) ).

Why do people get this backwards?  I've pondered that off & on for a
long time.  I think it goes like this:  at a given UTC time u, then,
say, u+1 is obviously an earlier time than u+2.  So the greater the
offset the later the time.  That's intuitively obvious.  What it
wholly misses is that it's got nothing to do with what we're _trying_
to ask ;-)  We're trying to ask about how times act on a non-UTC
clock.  In the bogus reasoning, u+1 and u+2 look an hour apart on the
UTC clock, so are irrelevant to the real question.

When looking at a non-UTC clock, the offsets have to be _subtracted_
from that clock's idea of time to determine corresponding UTC time,
and it's the negation that reverses the sense of the comparison
needed.  For an ambiguous local time T:

    offset1 < offset2
    # if and only if (negate, which also reverses the direction of comparison)
    - offset1 > - offset2
    # if and only if (add T to both sides)
    T - offset1 > T - offset2
    # if and only if (and now we have the UTC equivalents)
    UTCtime1 > UTCtime2

So we can't expect most people to get this right.

Wouldn't this work?  t is ambiguous if and only if

    tz.utcoffset(t) != tz.utcoffset(t.replace(first=not t.first))

That is, t is ambiguous iff the value of t.first makes a difference to
the offset.  I expect people _could_ get that right most of the time,
but may have trouble remembering "the trick".
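A minimal sketch of the "does the flag make a difference?" test, under
stated assumptions: it uses the `fold` attribute that Python 3.6+
provides as a stand-in for the `first` flag discussed here (fold=0
playing the role of first=True), and `ToyEastern` and `is_ambiguous`
are hypothetical names invented for illustration.  The toy zone models
only the repeated autumn hour and deliberately ignores the spring gap.

```python
from datetime import datetime, timedelta, tzinfo

HOUR = timedelta(hours=1)


class ToyEastern(tzinfo):
    """Hypothetical zone: UTC-5, with UTC-4 "summer time" in 2015 only.

    Only the repeated hour (01:00-02:00 on 2015-11-01) is modelled with
    care; the spring gap is deliberately ignored in this sketch.
    """

    DST_START = datetime(2015, 3, 8, 3)    # first summer-time wall reading
    FOLD_START = datetime(2015, 11, 1, 1)  # start of the repeated hour

    def dst(self, dt):
        naive = dt.replace(tzinfo=None)
        if self.FOLD_START <= naive < self.FOLD_START + HOUR:
            # fold == 0 is the first pass through the hour (still summer time)
            return timedelta(0) if dt.fold else HOUR
        if self.DST_START <= naive < self.FOLD_START:
            return HOUR
        return timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "TOY"


def is_ambiguous(tz, t):
    """True if flipping the flag changes the offset reported for naive t."""
    flipped = t.replace(fold=1 - t.fold)
    return tz.utcoffset(t) != tz.utcoffset(flipped)


tz = ToyEastern()
t = datetime(2015, 11, 1, 1, 30)          # inside the repeated hour
earlier_offset = tz.utcoffset(t)                   # first pass
later_offset = tz.utcoffset(t.replace(fold=1))     # second pass

# The earlier instant has the greater offset, and subtracting the
# offsets from the wall time puts the UTC equivalents in the right order.
assert earlier_offset > later_offset
assert t - earlier_offset < t - later_offset
```

The symmetric `!=` test does not depend on remembering which way the
offsets compare, which is the property argued for above.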
But nobody could screw up what, say, a new tz.is_ambiguous(t) means.

> I was hoping to sneak in a rule that for an invalid time t
>
>     tz.utcoffset(t) > tz.utcoffset(t.replace(first=False))

I don't want to try to figure out what that _really_ does, although as
noted at the end of case 2b above that expression returns True when
t.first is True and t is in fact the earlier of two ambiguous times.

Because local "missing times" have no spelling in UTC, I doubt there's
any way for simple .utcoffset() expressions to detect one reliably.
IIRC, the Python docs say nothing whatsoever about how missing times
are, or "should be", handled in conversion.

But if the tzinfo class has any intelligence about the rules it's
implementing, it should be easy for a new tz.is_missing_time(t) method
to apply that intelligence.  Or, say, just a single new tz.classify(t)
method returning, say, an or'ing of flags from these two sets:

    # set 1 - exactly one will be in the result
    TZ_HAPPY_TIME = 1
    TZ_MISSING_TIME = 2
    TZ_AMBIGUOUS_TIME = 4

    # set 2 - at most one will be in the result, and none with TZ_HAPPY_TIME
    TZ_DUE_TO_DST_TRANSITION = 64
    TZ_DUE_TO_BASE_OFFSET_TRANSITION = 128

> (I really don't want tz.utcoffset(t) to ever raise an exception)

Me neither.

> and of course, for most of the times
>
>     tz.utcoffset(t) == tz.utcoffset(t.replace(first=False))

Agreement at last ;-)  Although I'd spell it

    tz.utcoffset(t) == tz.utcoffset(t.replace(first=not t.first))

as a pretty direct translation of "the value of t.first makes no
difference".

>> The most important new question callers will want to resolve is "what should
>> `first` (aka is_dst) be now?".

> I want most callers

Gloss:  by "callers" I mean not just Python users, but also people
_implementing_ the new stuff.  Perhaps you do too.

> to be able to get away with not knowing that
> `first` exists and consistently get the earlier time from an ambiguous
> input and some "nearby" time from an invalid input.
In the case of a missing time, it's reasonable to guess they
definitely intended a time later than the closest preceding (on the
local clock) valid time.  It's also reasonable to guess they
definitely intended a time earlier than the closest succeeding (on the
local clock) valid time.  Happily, there is no possible local time
satisfying both ;-)

But there's no sensible way to compute either without knowing "the
rules" (did DST cause us to miss an hour? an hour and 30 minutes? just
15 minutes? did a politician decree we lost 2 hours? in any case, how
long ago did time _start_ to go missing? and when will it stop going
missing?).  Seems again a case that requires some intelligence _in_
the tzstrict class, not heroic efforts by callers restricted to
utcoffset() alone.

> A careful application will have to call tz.utcoffset() with both values of the
> flag and either warn about the default choice or ask the user for an
> additional input.

As above, how can one programmatically pick a valid default when faced
with a missing time?  The zoneinfo-like databases express this stuff
by giving parameters for a specific algorithm.  One relatively simple
rule can cover a vast span of time.  "It's easy" for code that _knows_
that stuff about the timezone.  If all the programmer can know is
.utcoffset() results at specific instants of time, I expect the best
they can do is loop, incrementing (or decrementing) by a naive minute
at a time (Python restricts UTC offsets to multiples of a minute),
until they find a roundtrip fixed point (i.e., the nearest local time
that "gets itself back" when converted to UTC and back again).

Isaac earlier sketched a mathematical framework for a different
approach to computing UTC offsets, which explicitly materialized that
it's a function made up of a sequence of continuous monotonically
increasing functions ("jumps in time" are discontinuities in the
range, and that's what separates one function from the next).
The start and end of each function's domain is explicit, and so then
also are the start and end of each function's image.  This makes
pretty much all conceivable elementary questions solvable by, at
worst, forms of binary search (e.g., "is this a missing time", "if so,
what's the next closest valid time (in either direction)", "how long
until the next transition of any kind?", "how many transitions of any
kind occurred in the past 1000 years?" ...).

But it's far more general than needed for any real-world time zone -
and there's no code for that either ;-)

From tim.peters at gmail.com Sat Aug 1 07:39:19 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 1 Aug 2015 00:39:19 -0500
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References: <55BABE77.9050802@stoneleaf.us>
 <55BACAB1.5080301@stoneleaf.us>
Message-ID:

[Łukasz Rekucki]
>> What happens then when you subtract a datetime with *strict* tzinfo
>> and a *naive* one?  Would A - B == - (B - A) still be true?

[Guido]
> [re-adding the list]
>
> That's for the authors of the new PEP to decide, really, but I think it
> could be made to follow the strict rules in both cases, since clearly the
> code isn't an old program requiring backward compatibility (how would such a
> program end up with a strict tzinfo?).

This one solves itself:  it's _already_ the case that subtraction of
aware datetime objects uses timeline arithmetic _unless_ both
datetimes share a .tzinfo member.  If one uses a tzstrict instance and
the other does not, it's impossible that they both use the same
instance.  So timeline arithmetic will be used in any such case.

Maybe some sketchy pseudocode will make it more obvious:

    class datetime:
        ...
        def __sub__(x, y):
            # assume x and y are both aware datetimes
            if x.tzinfo is y.tzinfo:
                # compute the difference using classic arithmetic
                ...
            else:
                # compute the difference using timeline arithmetic
                ...

It's been like that forever.
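The identity test in the pseudocode above can be observed with the
stock datetime module.  This is a small sketch, assuming a
hypothetical `ToyEastern` zone invented for illustration (UTC-5, with
a one-hour summer offset in 2015 only); two instances with identical
rules are used to show that what matters is object identity, not
equality of the rules.

```python
from datetime import datetime, timedelta, tzinfo


class ToyEastern(tzinfo):
    """Hypothetical zone: UTC-5, with UTC-4 summer time in 2015 only."""

    def dst(self, dt):
        naive = dt.replace(tzinfo=None)
        in_summer = datetime(2015, 3, 8, 2) <= naive < datetime(2015, 11, 1, 2)
        return timedelta(hours=1) if in_summer else timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "TOY"


shared = ToyEastern()
other = ToyEastern()   # same rules, but a different tzinfo object

# Noon on the day after, and noon on the day before, the autumn transition.
x = datetime(2015, 11, 1, 12, tzinfo=shared)
y_same = datetime(2015, 10, 31, 12, tzinfo=shared)
y_other = datetime(2015, 10, 31, 12, tzinfo=other)

classic = x - y_same     # same tzinfo object: offsets are ignored
timeline = x - y_other   # different objects: offsets are taken into account

print(classic)    # wall-clock difference
print(timeline)   # elapsed-time difference
```

Swapping the operands negates each result but does not change which
kind of arithmetic is applied.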
Note that the order of the operands is irrelevant to which kind of
arithmetic is used.  The same applies, mutatis mutandis, to datetime
comparison operations.

What does need to change is that "x.tzinfo is y.tzinfo" needs more
qualification, so that two datetimes sharing the same tzstrict
instance don't end up using classic subtraction.  I'm sure the docs
will become even more pleasant to read ;-)

From ethan at stoneleaf.us Sat Aug 1 08:57:08 2015
From: ethan at stoneleaf.us (Ethan Furman)
Date: Fri, 31 Jul 2015 23:57:08 -0700
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References:
Message-ID: <55BC6DC4.2060105@stoneleaf.us>

On 07/31/2015 10:16 PM, Tim Peters wrote:
> [Alexander Belopolsky]
>> (I really don't want tz.utcoffset(t) to ever raise an exception)
> [Tim]
> Me neither.

Why not?  If the programmer is using strict tzinfos how would they end
up with an invalid t?  I only see two ways:

- constructing from a literal (in which case an exception should be
  raised)

- t is using a non-strict or missing tzinfo, possibly from an addition
  or subtraction (in which case we can't know which direction they
  were going and should not guess -- so raise an exception)

--
~Ethan~

From ischwabacher at wisc.edu Sat Aug 1 15:41:34 2015
From: ischwabacher at wisc.edu (ISAAC J SCHWABACHER)
Date: Sat, 01 Aug 2015 13:41:34 +0000
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References:
Message-ID:

> Isaac earlier sketched a mathematical framework for a different
> approach to computing UTC offsets, which explicitly materialized that
> it's a function made up of a sequence of continuous monotonically
> increasing functions ("jumps in time" are discontinuities in the
> range, and that's what separates one function from the next).  The
> start and end of each function's domain is explicit, and so then also
> are the start and end of each function's image.
> This makes pretty
> much all conceivable elementary questions solvable by, at worst, forms
> of binary search (e.g., "is this a missing time", "if so, what's the
> next closest valid time (in either direction)", "how long until the
> next transition of any kind?", "how many transitions of any kind
> occurred in the past 1000 years?" ...).

*Almost* everything can be accomplished with
tz.first_transition_after(dt) and tz.last_transition_at_or_before(dt)
returning appropriate (trans_utc, before_info, after_info) tuples, but
not quite.  But yes, I think it would be valuable to expose the
transition times in some way, though preferably not as a list since
that would preclude POSIX-style time zones (which have an infinite
number of such transitions).  Does anyone else have a better idea for
this API?

> But it's far more general than needed for any real-world time zone -
> and there's no code for that either ;-)

Not yet.  I had finally gotten to work on it and realized that the API
I was going to propose was insufficient to the task.

ijs

From alexander.belopolsky at gmail.com Sat Aug 1 16:36:54 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sat, 1 Aug 2015 10:36:54 -0400
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References:
Message-ID:

On Sat, Aug 1, 2015 at 1:16 AM, Tim Peters wrote:
>> With my proposal, a naive datetime t is ambiguous in timezone tz if
>>
>>     tz.utcoffset(t) < tz.utcoffset(t.replace(first=False))
>>
>> or "is this an invalid (missing) time?"
>
> Unless I'm missing your intent entirely, that's a fine illustration of
> my "The logic is bound to be annoying enough that we'd want to
> concentrate it in tzstrict".  The problem I see is that the expression
> you gave can never be true.

You are absolutely right and this is the intent.
The challenge that I tried to solve was that the local-to-global
function (G(t)) can have 0, 1 or 2 values if defined as the
mathematical inverse of the global-to-local (L(u)) function.  (Purists
would say that this means that local-to-global is not a function, but
I find it convenient to say that a function has multiple values when
it returns a variable-length list.)

At the same time, I wanted naive code u = G(t) to (a) work for all
values of t; (b) produce a correct result when L(u) = t has only one
solution; (c) produce one of the "correct" results when L(u) = t has
two solutions; and (d) produce a "useful" result when L(u) = t has no
solutions.

This ruled out the obvious design where G(t) would return [], [u] or
[u0, u1] because all naive code that used u = G(t) would have to be
rewritten as u = G(t)[0] and you would still face an index error when
t is in the gap.  (I've recently learned this useful terminology: the
interval of non-existent local times that occurs when you move the
clock forward is called a "gap" and the interval of ambiguous local
times that occurs when you move the clock back is called a "fold".)

The other solution was to give G(t) an additional argument so that you
could specify the index into the returned list upfront:

    def xG(t, which=0):
        return G(t)[which]

This makes the naive u = xG(t) code work in 99.99% of the cases, but
you still face an occasional index error.

So how do you represent three outcomes [], [u] or [u0, u1] in a way
that xG(t) always works?  My solution:

    [] -> [u1, u0]
    [u] -> [u, u]
    [u0, u1] -> [u0, u1]

Note that this solution satisfies all my design criteria including
(d).  The results produced from the time in a gap are "useful" because
the default xG(t) result is what most people mean when they specify
the time in the gap: they do it because they are unaware of the time
change and expect 02:30 AM to be 150 minutes after midnight not
knowing that it will be called 03:30 AM.
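A minimal sketch of the always-two-values convention, assuming a
single transition at UTC instant X.  The names `extended_G` and
`classify`, and the choice to pass the offsets before and after the
transition as explicit arguments, are hypothetical and made only for
illustration; this is not the reference implementation.

```python
from datetime import datetime, timedelta


def extended_G(t, X, o_before, o_after):
    """Return two UTC candidates for naive local time t.

    X is the UTC instant of the transition; o_before and o_after are
    the UTC offsets in force before and after it.
    """
    u_before = t - o_before    # candidate under the pre-transition rule
    u_after = t - o_after      # candidate under the post-transition rule
    ok_before = u_before < X   # is it really before the transition?
    ok_after = u_after >= X    # is it really at or after the transition?
    if ok_before == ok_after:
        # Both valid (fold) or neither valid (gap): keep both candidates.
        return [u_before, u_after]
    u = u_before if ok_before else u_after
    return [u, u]


def classify(pair):
    """Tell the three outcomes apart from the order of the two values."""
    first, second = pair
    if first < second:
        return "fold"
    if first > second:
        return "gap"
    return "unique"


EST, EDT = timedelta(hours=-5), timedelta(hours=-4)

# Spring forward: 02:00 local (07:00 UTC) becomes 03:00.
spring = extended_G(datetime(2015, 3, 8, 2, 30),
                    datetime(2015, 3, 8, 7), EST, EDT)

# Fall back: 02:00 local (06:00 UTC) becomes 01:00 again.
autumn = extended_G(datetime(2015, 11, 1, 1, 30),
                    datetime(2015, 11, 1, 6), EDT, EST)

# An ordinary time later on the day of the spring transition.
plain = extended_G(datetime(2015, 3, 8, 12),
                   datetime(2015, 3, 8, 7), EST, EDT)
```

Indexing any of these results with 0 or 1 never fails, and a pair
whose first value is later than its second can only have come from a
gap.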
The other solution is also useful because it allows you to detect the
time in the gap without calling L(u) on the result.

From alexander.belopolsky at gmail.com Sat Aug 1 17:27:14 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sat, 1 Aug 2015 11:27:14 -0400
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References:
Message-ID:

On Sat, Aug 1, 2015 at 10:36 AM, Alexander Belopolsky wrote:
> So how do you represent three outcomes [], [u] or [u0, u1] in a way that xG(t)
> always works?  My solution:
>
>     [] -> [u1, u0]
>     [u] -> [u, u]
>     [u0, u1] -> [u0, u1]

Let me clarify what I propose to return for the local time in a gap:
the two values u1 and u0 are *not* solutions to L(u) = t.  For t in a
gap, no such solutions exist.  Instead, u0 is the solution for
L0(u) = t where L0 is L linearly extrapolated from the times before
the gap forward and u1 is the solution for L1(u) = t where L1 is L
linearly extrapolated from the times after the gap back.

In the case of the US-style spring jump from 01:59 to 03:00 AM, for
t = 02:30 AM, u0 is such that L(u0) = 03:30 AM (this is the "what I
meant when I said 02:30" time) and L(u1) = 01:30 AM.

From ethan at stoneleaf.us Sat Aug 1 19:52:29 2015
From: ethan at stoneleaf.us (Ethan Furman)
Date: Sat, 01 Aug 2015 10:52:29 -0700
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References:
Message-ID: <55BD075D.3090609@stoneleaf.us>

On 08/01/2015 08:27 AM, Alexander Belopolsky wrote:
> In the case of the US-style spring jump from 01:59 to 03:00 AM, for t
> = 02:30 AM, u0 is such that L(u0) = 03:30 AM (this is the "what I
> meant when I said 02:30" time) and L(u1) = 01:30 AM.

The problem here is that if somebody is counting backwards to get that
1:30, then the time they need is 12:30, not 2:30.
As a case in point:  Today I have a veterinary appointment for my cat
to check his medication levels; the appointment is at 14:30, and needs
to be in the window of 4 to 6 hours of him taking his meds.  Counting
backwards from 14:30 gives me a window of 8:30 to 10:30 to administer
his meds.

I'm happy to concede that counting backwards to get the start time is
less frequent, and the times this hits/crosses a time shift are even
less frequent, but that is all the more reason to refuse the
temptation to guess.

--
~Ethan~

From alexander.belopolsky at gmail.com Sat Aug 1 20:18:11 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sat, 1 Aug 2015 14:18:11 -0400
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To: <55BD075D.3090609@stoneleaf.us>
References: <55BD075D.3090609@stoneleaf.us>
Message-ID:

On Sat, Aug 1, 2015 at 1:52 PM, Ethan Furman wrote:
> As a case in point:  Today I have a veterinary appointment for my cat to check
> his medication levels; the appointment is at 14:30, and needs to be in the
> window of 4 to 6 hours of him taking his meds.  Counting backwards from 14:30
> gives me a window of 8:30 to 10:30 to administer his meds.
>
> I'm happy to concede that counting backwards to get the start time is less
> frequent, and the times this hits/crosses a time shift are even less
> frequent, but that is all the more reason to refuse the temptation to guess.

My goal here is to minimize the impact on the programs that are
already written and deployed.  It is better to get a wrong time for
one appointment than to lose all appointments due to a program crash.
If your vet uses a naively written program, she probably already has a
work-around for DST transitions.  I want to make sure her workaround
does not break with future Python releases.
When you write a new program, you can easily write your own function
that will do whatever you want when it detects an ambiguous time: from
raising an exception to calling an airstrike on whatever agency is
responsible for the DST transition.  We may even provide convenience
APIs in the tzstrict class to do something like the first option.
What we cannot do is to make dt.utcoffset() or dt.dst() raise an
exception in the case where they did not in the past releases of
Python and popular timezone libraries.

From ethan at stoneleaf.us Sat Aug 1 20:34:37 2015
From: ethan at stoneleaf.us (Ethan Furman)
Date: Sat, 01 Aug 2015 11:34:37 -0700
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References: <55BD075D.3090609@stoneleaf.us>
Message-ID: <55BD113D.5020609@stoneleaf.us>

On 08/01/2015 11:18 AM, Alexander Belopolsky wrote:
> On Sat, Aug 1, 2015 at 1:52 PM, Ethan Furman wrote:
>> I'm happy to concede that counting backwards to get the start time is less
>> frequent, and the times this hits/crosses a time shift are even less
>> frequent, but that is all the more reason to refuse the temptation to guess.
>
> My goal here is to minimize the impact on the programs that are
> already written and deployed.

Why is this even a problem?  Already written programs will not be
using the new strict tzinfo.

--
~Ethan~

From alexander.belopolsky at gmail.com Sat Aug 1 21:09:27 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sat, 1 Aug 2015 15:09:27 -0400
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To: <55BD113D.5020609@stoneleaf.us>
References: <55BD075D.3090609@stoneleaf.us>
 <55BD113D.5020609@stoneleaf.us>
Message-ID:

On Sat, Aug 1, 2015 at 2:34 PM, Ethan Furman wrote:
>> My goal here is to minimize the impact on the programs that are
>> already written and deployed.
>
>
> Why is this even a problem?  Already written programs will not be using the
> new strict tzinfo.
They will if their tzinfo provider switches to tzstrict.

From alexander.belopolsky at gmail.com Sat Aug 1 21:27:00 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sat, 1 Aug 2015 15:27:00 -0400
Subject: [Datetime-SIG] Local time disambiguation proposal
In-Reply-To:
References:
Message-ID:

On Thu, Jul 30, 2015 at 12:43 PM, Alexander Belopolsky wrote:
> I would like to make a specific proposal on a narrow topic of how to
> handle ambiguities inherent in representing time instances in common
> time zones.

I have now posted a first draft of the reference implementation at the
bug tracker:

Issue: http://bugs.python.org/issue24773
Patch: http://bugs.python.org/file40094/ltdf.patch

I am also working on converting my write-up [1] for publication as a
PEP.  Please note that I have made substantial changes since my
initial post, so if you comment on the proposal please make sure that
you've read the latest version.

[1]: https://github.com/abalkin/ltdf/blob/master/README.rst

From ethan at stoneleaf.us Sat Aug 1 21:40:14 2015
From: ethan at stoneleaf.us (Ethan Furman)
Date: Sat, 01 Aug 2015 12:40:14 -0700
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References: <55BD075D.3090609@stoneleaf.us>
 <55BD113D.5020609@stoneleaf.us>
Message-ID: <55BD209E.4050006@stoneleaf.us>

On 08/01/2015 12:09 PM, Alexander Belopolsky wrote:
> On Sat, Aug 1, 2015 at 2:34 PM, Ethan Furman wrote:
>>> My goal here is to minimize the impact on the programs that are
>>> already written and deployed.
>>
>>
>> Why is this even a problem?  Already written programs will not be using the
>> new strict tzinfo.
>
> They will if their tzinfo provider switches to tzstrict.

It is not our problem if a third-party library makes backwards
incompatible changes.  We will still be supporting all three types in
the stdlib: naive, aware, and strict.

If this new 'strict' support still has buggy behaviour around time
shifts, why are we bothering?
--
~Ethan~

From alexander.belopolsky at gmail.com Sat Aug 1 21:58:54 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sat, 1 Aug 2015 15:58:54 -0400
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To: <55BD209E.4050006@stoneleaf.us>
References: <55BD075D.3090609@stoneleaf.us>
 <55BD113D.5020609@stoneleaf.us>
 <55BD209E.4050006@stoneleaf.us>
Message-ID:

On Sat, Aug 1, 2015 at 3:40 PM, Ethan Furman wrote:
> If this new 'strict' support still has buggy behaviour around time shifts,
> why are we bothering?

Because the new 'strict' way of dealing with timezones and datetime
arithmetics will become our "gold standard" and we will not tolerate
buggy behavior even in the corner cases?

Speaking seriously, however, the part that I really want to salvage is
Tim's clever algorithm that implements fromutc() once the user has
provided utcoffset() and dst() methods.  This algorithm relies on a
technically illegal operation: calling utcoffset() and dst() on a
datetime instance representing time in UTC.  If we allow utcoffset()
or dst() to raise an error on some values of datetime - this algorithm
won't work.

[1]: https://hg.python.org/cpython/file/f6a3310d3cc9/Lib/datetime.py#l957

From ethan at stoneleaf.us Sat Aug 1 22:51:31 2015
From: ethan at stoneleaf.us (Ethan Furman)
Date: Sat, 01 Aug 2015 13:51:31 -0700
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References: <55BD075D.3090609@stoneleaf.us>
 <55BD113D.5020609@stoneleaf.us>
 <55BD209E.4050006@stoneleaf.us>
Message-ID: <55BD3153.6050909@stoneleaf.us>

On 08/01/2015 12:58 PM, Alexander Belopolsky wrote:
> On Sat, Aug 1, 2015 at 3:40 PM, Ethan Furman wrote:
>> If this new 'strict' support still has buggy behaviour around time shifts,
>> why are we bothering?
>
> Because the new 'strict' way of dealing with timezones and datetime
> arithmetics will become our "gold standard" and we will not tolerate
> buggy behavior even in the corner cases?
I have no doubt that adding and subtracting strict dt's will behave
appropriately.

I just also think that part of our "gold standard" is refusing the
temptation to guess what the result should have been when handed a
non-strict dt that is either ambiguous or non-existent.

--
~Ethan~

From alexander.belopolsky at gmail.com Sat Aug 1 23:01:40 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sat, 1 Aug 2015 17:01:40 -0400
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To: <55BD3153.6050909@stoneleaf.us>
References: <55BD075D.3090609@stoneleaf.us>
 <55BD113D.5020609@stoneleaf.us>
 <55BD209E.4050006@stoneleaf.us>
 <55BD3153.6050909@stoneleaf.us>
Message-ID:

On Sat, Aug 1, 2015 at 4:51 PM, Ethan Furman wrote:
> I just also think that part of our "gold standard" is refusing the temptation
> to guess what the result should have been when handed a non-strict dt that
> is either ambiguous or non-existent.

Ethan, you responded to a joke, but snipped the substantive argument.

From ethan at stoneleaf.us Sun Aug 2 00:35:59 2015
From: ethan at stoneleaf.us (Ethan Furman)
Date: Sat, 01 Aug 2015 15:35:59 -0700
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References: <55BD075D.3090609@stoneleaf.us>
 <55BD113D.5020609@stoneleaf.us>
 <55BD209E.4050006@stoneleaf.us>
 <55BD3153.6050909@stoneleaf.us>
Message-ID: <55BD49CF.8060508@stoneleaf.us>

On 08/01/2015 02:01 PM, Alexander Belopolsky wrote:
> On Sat, Aug 1, 2015 at 4:51 PM, Ethan Furman wrote:
>> I just also think that part of our "gold standard" is refusing the temptation
>> to guess what the result should have been when handed a non-strict dt that
>> is either ambiguous or non-existent.
>
> Ethan, you responded to a joke, but snipped the substantive argument.

My apologies.  I don't have much input about the actual algorithms to
use, and, quite frankly, still find DST irritating and confusing.
It wasn't until a couple years ago that somebody taught me "fall back,
spring forward" so I could at least keep straight which way the clock
was going to shift.

So really my involvement here is as a user of these things -- and as a
user I would be greatly frustrated with being able to hand an
ambiguous/non-existent date/time to a strict dt/tzinfo and not get an
error because sooner or later the guess would be wrong and I'd be
wondering what happened.

--
~Ethan~

From alexander.belopolsky at gmail.com Sun Aug 2 01:02:40 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sat, 1 Aug 2015 19:02:40 -0400
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To: <55BD49CF.8060508@stoneleaf.us>
References: <55BD075D.3090609@stoneleaf.us>
 <55BD113D.5020609@stoneleaf.us>
 <55BD209E.4050006@stoneleaf.us>
 <55BD3153.6050909@stoneleaf.us>
 <55BD49CF.8060508@stoneleaf.us>
Message-ID:

On Sat, Aug 1, 2015 at 6:35 PM, Ethan Furman wrote:
> So really my involvement here is as a user of these things -- and as a user
> I would be greatly frustrated with being able to hand an
> ambiguous/non-existent date/time to a strict dt/tzinfo and not get an error
> because sooner or later the guess would be wrong and I'd be wondering what
> happened.

Luckily, tzinfo does not currently have a "to_utc()" method, so we can
implement it in tzstrict as

    def to_utc(self, dt):
        o0 = self.utcoffset(dt)
        o1 = self.utcoffset(dt.replace(first=not dt.first))
        if o0 == o1:
            # local = UTC + offset, so UTC = local - offset
            return dt - o0
        # In a fold the earlier (first) reading has the greater offset;
        # in a gap the order of the two offsets is reversed.
        if (o0 > o1) == dt.first:
            raise AmbiguousLocalTimeError
        else:
            raise InvalidLocalTimeError

However, some programmers may prefer to get more than one bit of
information whenever they query the timezone database.  The problem
with an InvalidLocalTimeError is that it does not help the user to
correct it.  Rather than saying "invalid time - try again", it would
be more helpful to say something like: "02:30 is invalid, would you
like to schedule your event for [ ] 01:30; [ ] 03:30; or other: ____".
This will require knowledge that 01:30 and 03:30 are valid, which you
will have after you called self.utcoffset(dt) and
self.utcoffset(dt.replace(first=not dt.first)), but the hypothetical
to_utc() function will discard that knowledge.

From alexander.belopolsky at gmail.com Sun Aug 2 01:33:43 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sat, 1 Aug 2015 19:33:43 -0400
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References:
Message-ID:

On Sat, Aug 1, 2015 at 1:16 AM, Tim Peters wrote:
>> A careful application will have to call tz.utcoffset() with both values of the
>> flag and either warn about the default choice or ask the user for an
>> additional input.
>
> As above, how can one programmatically pick a valid default when faced
> with a missing time?

Suppose, on UTC time u=X, local clocks are advanced d > 0 units.  Then
the function L(u) that maps UTC time u to local time can be written as

    L(u) = u + o + d * 1[u >= X]

where o is the UTC offset before the transition and 1[] is the
(Knuth?) indicator function.  Let

    L0(u) = u + o
    L1(u) = u + o + d

My proposal is that when t is between X + o and X + o + d and
therefore t = L(u) has no solution, we should offer solutions to
t = L0(u) and t = L1(u) instead.  (These solutions are, BTW, t - o and
t - o - d.)

With the notation introduced so far, my "extended local-to-global
function" xG(t, first=True) can be written as

    def xG(t, first=True):
        if X + o <= t < X + o + d:
            if first:
                return t - o
            else:
                return t - o - d
        ...  # handle other times

Note that xG(t, first=True) > xG(t, first=False) is the deliberate
choice that makes it dead easy to detect that the returned values are
not solutions of t = L(u).

From tim.peters at gmail.com Sun Aug 2 11:11:03 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 2 Aug 2015 04:11:03 -0500
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To: <55BA88F8.4080105@oddbird.net>
References: <55BA88F8.4080105@oddbird.net>
Message-ID:

[Carl Meyer]

Sorry for the delay - there's just too much here to keep up with.  I
enjoyed and appreciated your essay, and will take time to give you
straight answers about the "most important" question you asked.  Of
course I don't speak for Guido.  Although I often do ;-)

> ...
> In order to defend the current model as coherent, one has to discard one
> of the following points, and (despite having read every message in all
> the related threads), I am still not clear precisely which one of these
> Tim et al consider untrue or expendable:
>
> 1) A datetime with a tzinfo member that carries both a timezone and a
> specific UTC offset within that timezone (e.g. a pytz timezone instance)
> corresponds precisely and unambiguously to a single instant in
> astronomical time (as well as carrying additional information).

datetime had no intent to support "astronomical time" in any way,
shape or form.  It's no coincidence that, in Guido's first message
about "naive time":

    https://mail.python.org/pipermail/python-dev/2002-March/020648.html

he talked about "for most *business* uses of date and time".

datetime development was suggested & funded by Zope Corporation, which
mostly works to meet other businesses' "content management" needs.
The use cases collected were overwhelmingly from the commercial
business world.

Astronomical time systems weren't on the table.  In this respect, it's
important to realize that while Python 3.2 finally supplied a concrete
instance (of a tzinfo subclass) as "the standard" UTC timezone object
(datetime.timezone.utc), that's still just an approximation:  it
wholly ignores that real-life UTC suffers from leap seconds added (or,
perhaps some day also removed) at various times.
Subtract two datetimes in `utc`, and the duration returned may be off from real life, but whether and by how much can only be determined by looking up the history of leap second adjustments (made to real-life UTC). Those who suspect "Noda Time" is what they really want should note that it ignores leap seconds too. As they say on their site, "We want to solve the 99% case. Noda Time doesn't support leap seconds, relativity or various other subtleties around time lines." Although in the Zope community (which mostly drove Python's datetime requirements), it was more the 99.997% case ;-) If an astronomical union had funded the project instead ... > 2) A timedelta object is clearly a Duration, not a Period, because > timedelta(days=1), timedelta(hours=24), and timedelta(seconds=86400) > result in indistinguishable objects. I think this point is > uncontroversial; Tim has said several times that a timedelta is just a > complicated representation of an integer number of microseconds. That's > a Duration. That's my view, yes. Although these are "naive time" microseconds too, with eternally fixed relation to all of naive time seconds, minutes, hours, days and weeks. In real-life UTC, you can't even say how long a minute is in seconds - "it depends". > 3) If one has two datetime objects that each unambiguously correspond to > an instant in real time, and one subtracts them and gets back an object > which represents a Duration in real microseconds, the only reasonable > content for that Duration is the elapsed microseconds in real time between > the two instants. Since there's no accounting for leap seconds, this cannot always be true using tzinfo objects approximating real-life UTC, or any timezone defined as offsetting real-life UTC. Which is all of 'em ;-) So what's the hangup with leap seconds? They're of no use to business applications, but would introduce irregularities business logic is ill-prepared to deal with. 
Same as DST transitions, leap-second adjustments can create missing and ambiguous times on a local clock. But unlike DST transitions, which occur in each jurisdiction at a time picked to be minimally visible in the jurisdiction (wee hour on a weekend), leap-second adjustments occur at a fixed UTC time, which is usually "in the middle of the work day" in _some_ jurisdictions. For that reason, when a leap second was inserted this year, some major financial markets across the world - normally open at the adjustment time! - shut down temporarily rather than risk a cascade of software disasters: http://money.cnn.com/2015/06/29/technology/leap-second/ I'm glad they did. Example: The order in which trades are executed (based on timestamps with sub-second resolution) can have legal consequences. For example, a big customer calls a broker and tells them to buy a million shares of Apple stock. The broker thinks "good idea!". He tells his software to place the customer buy order, then wait a millisecond, then send an order to buy a thousand shares for his own account. That's legal. If the orders are placed in the opposite order, it's illegal and the broker could go to jail ("front running", placing his order first _knowing_ that a large order will soon follow; the large order will certainly drive the stock price up, benefiting the broker who bought before the thoroughly predictable rise). Inserting a leap second causes the local clock to "repeat a second" in its idea of time (just as "inserting an hour" at the end of DST causes local clocks to repeat an hour) - or to blow up. A repeated second could cause the orders in the example above to _appear_ to have arrived in "the other" order. Even if the system time services report a time like 13:59:60.000 (instead of repeating 13:59:59.000), lots of software never expected to see such a thing. Who knows what may happen? So I doubt datetime will ever use "real UTC". It's pretty horrid! 
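The gap between datetime's `utc` and real-life UTC can be seen in a short sketch. It assumes only the standard library's datetime module and the leap second inserted at the end of 2015-06-30 (the one behind the market shutdowns mentioned above); the variable names are illustrative, not anything from the thread.

```python
from datetime import datetime, timedelta, timezone

# A leap second (23:59:60 UTC) was inserted at the end of 2015-06-30.
before = datetime(2015, 6, 30, 23, 59, 59, tzinfo=timezone.utc)
after = datetime(2015, 7, 1, 0, 0, 0, tzinfo=timezone.utc)

# datetime arithmetic treats every minute as exactly 60 seconds, so the
# difference is 1 second even though 2 SI seconds elapsed in real life.
naive_elapsed = after - before

# datetime has no way to spell the leap second itself.
try:
    datetime(2015, 6, 30, 23, 59, 60, tzinfo=timezone.utc)
    leap_second_representable = True
except ValueError:
    leap_second_representable = False

print(naive_elapsed, leap_second_representable)
```

Recovering the real elapsed time would take a table of leap-second adjustments, which datetime does not carry.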
For another example, what will the UTC calendar date and time be 300 million seconds from now? That's simply impossible to compute for real UTC, not even in theory. Saying how many seconds away it will be is trivial (300 million!), but the physical processes causing leap second adjustments to UTC are chaotic - nobody can predict how many leap second adjustments will be made to UTC over the next 300 million seconds, or when, so there's no way to know what the UTC calendar date and time will be then. It _can_ affect the calendar date-and-time even for times just half a year in the future. Unless the definition of UTC is changed yet again (dead serious proposals for which are pending, supported by most participating countries):

    https://en.wikipedia.org/wiki/Leap_second#Proposal_to_abolish_leap_seconds

That page is also interesting for its account of various software problems known to have been caused so far by leap-second adjustments.

Anyway, under "real UTC" today, you could get an excellent approximation of "real time durations" by subtracting, but would have to accept that there is no fixed mapping between UTC timeline points and calendar notations except for datetimes no later than about 3 months from now (best I can tell, "they" don't promise to give more than 3 months' notice before the next leap second adjustment).

Finally, I have to note the irony in asking anything about "real time" ;-) What does "real time" mean? The most accurate clocks we have are atomic clocks, but even when two are made as identically as possible - even if we made two that kept _perfect_ time forever - they will _appear_ to run at slightly different rates when placed at different locations on Earth. That's at least due to gravitational time dilation: relativistic effects matter at currently achievable resolutions.
As a result, current TAI time (the astonishingly uniform "atomic time" measure from which today's definition of UTC is derived) can't be known _as_ it happens: it's the output of an algorithm (which consumes time!) that collects "elapsed seconds" from hundreds of ultra-stable clocks around the globe, and averages them in a way to make a highly informed, provably excellent guess at what they would have said had they all been flawless, all at mean sea level altitude, and all at 0 degrees Kelvin. This computed "TAI time" is out of date by the time it's known, and typically disagrees (slightly) with most of the clocks feeding into it.

So the best measure of "real time" we have is a product of human ingenuity. The closer to "plain old unadulterated real time as it exists in nature" you want to get, the more contrived & bogglingly complex the means needed to achieve it ;-)

Everyone is settling for an approximation, because that's the best that can be done. Naive time starts and stops with what most people "already know".

When UTC started mucking with leap seconds (it didn't always), the computing world should have embraced TAI internally instead. TAI suffers no adjustments of any kind, ever - it's just the running total of SI seconds since the start of the TAI epoch, as determined by the best clocks on Earth. In fact, it's very close to Python's "naive time"! TAI uses the proleptic Gregorian calendar too (albeit starting at a different epoch than year 1), and the TAI "day" is also defined to be exactly 86400 SI seconds. The difference is that TAI's Gregorian calendar will, over time, become unboundedly out of synch with UTC's Gregorian calendar, as leap seconds pile up in the latter. So far they're only 36 seconds out of synch.

> ...
> To be clear, I'm not arguing that this behavior can now be changed in
> the existing library objects in a backwards-incompatible way.
> But accepting that it is lacking in internal coherence (rather than just
> being an "alternative and equally good model") would be useful in
> clarifying what kind of an implementation we actually want (IMO,
> something very much like JodaTime/NodaTime). And then can we figure out
> how to get there from here.

I mentioned Noda Time before. Just looked up Joda-Time, and:

    http://joda-time.sourceforge.net/faq.html

    """
    Joda-Time does not support leap seconds. Leap seconds can be supported
    by writing a new, specialized chronology, or by making a few
    enhancements to the existing ZonedChronology class. In either case,
    future versions of Joda-Time will not enable leap seconds by default.
    Most applications have no need for it, and it might have additional
    performance costs.
    """

There's a pattern here: "almost all" people want nothing to do with leap seconds, not even time library developers. That doesn't mean they're right. But it doesn't mean they're wrong either ;-) Without leap seconds, they're all approximating real-life UTC, and in the same way Python's `utc` is.

From guido at python.org  Sun Aug  2 15:46:47 2015
From: guido at python.org (Guido van Rossum)
Date: Sun, 2 Aug 2015 15:46:47 +0200
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References: <55BA88F8.4080105@oddbird.net>
Message-ID:

There's a simpler reason for ignoring leap seconds in datetime: Python's wall clock is mappable to POSIX timestamps, which also ignore leap seconds (the reason being Tim's long explanation :-).

-- 
--Guido van Rossum (python.org/~guido)

From alexander.belopolsky at gmail.com  Sun Aug  2 16:35:01 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sun, 2 Aug 2015 10:35:01 -0400
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References: <55BA88F8.4080105@oddbird.net>
Message-ID:

On Sun, Aug 2, 2015 at 9:46 AM, Guido van Rossum wrote:
> There's a simpler reason for ignoring leap seconds in datetime: Python's
> wall clock is mappable to POSIX timestamps, which also ignore leap seconds
> (the reason being Tim's long explanation :-).

Note that if you combine several recently made proposals, you can have a fully backward-compatible solution for leap seconds. You can simply use time(23, 59, ss, us, first=False) to stand for times in the 23:60 minute and delegate arithmetic to the tzinfo object:

    def __add__(self, delta):
        try:
            add = self.tzinfo.add
        except AttributeError:
            ...  # current logic
        else:
            return add(self, delta)

Same for `__sub__` and possibly `strftime`, `isoformat`, etc. if you want time(23, 59, ss, us, first=False) to be printed as 23:60:ss.us.

Note that the Olson database contains information about leap seconds, so once you have access to that and a way to spell 23:60 (as the repeated 23:59!) you can easily implement a "correct" UTC timezone.

On the other hand, I would like to proceed in baby steps. Let's first implement the means to disambiguate repeated times and various improvements that it will allow to the existing datetime functionality. Meanwhile the developers of timezone libraries will hopefully embrace the new feature and improve their offerings.

Hopefully, by the time we are ready to distribute a full TZ database with Python, the problem of distribution will be solved by IETF [1]. If their solution (as expected) includes leap second information, we can find a way to give Python users access to it in one way or another.

[1]: https://tools.ietf.org/html/draft-ietf-tzdist-service-11

From tim.peters at gmail.com  Sun Aug  2 17:36:21 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 2 Aug 2015 10:36:21 -0500
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References: <55BA88F8.4080105@oddbird.net>
Message-ID:

[Guido]
> There's a simpler reason for ignoring leap seconds in datetime: Python's
> wall clock is mappable to POSIX timestamps, which also ignore leap seconds
> (the reason being Tim's long explanation :-).

Ya, but I wanted to give some reasons that make actual sense ;-)

Because when time wonks get agitated about this, it's just like American politics: both sides dig in and endlessly repeat the same talking points with ever-increasing volume.

"Because POSIX said so" was smashed by the ever-affable Daniel Bernstein a long time ago:

    http://cr.yp.to/proto/utctai.html
    ...
    The main obstacle is POSIX. POSIX is a ``standard'' designed by a
    vendor consortium several years ago to eliminate progress and
    protect the installed base. The behavior of the broken localtime()
    libraries was documented and turned into a POSIX requirement.

    Fortunately, the POSIX rules are so outrageously dumb---for example,
    they require that 2100 be a leap year, contradicting the Gregorian
    calendar---that no self-respecting engineer would obey them.

See? You've been paid off by vendors to eliminate progress, to protect their ill-gotten gains. datetime is just another tool of capitalist pigs to ensure they increase their profits at the expense of the people.

Just thought I'd put that out there first ;-)

boldly-speaking-truth-to-power-since-2015-ly y'rs  - tim

From lrekucki at gmail.com  Sun Aug  2 19:32:58 2015
From: lrekucki at gmail.com (Łukasz Rekucki)
Date: Sun, 2 Aug 2015 19:32:58 +0200
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References: <55BA88F8.4080105@oddbird.net>
Message-ID:

On Sunday, August 2, 2015, Alexander Belopolsky <alexander.belopolsky at gmail.com> wrote:
> On Sun, Aug 2, 2015 at 9:46 AM, Guido van Rossum wrote:
> > There's a simpler reason for ignoring leap seconds in datetime: Python's
> > wall clock is mappable to POSIX timestamps, which also ignore leap seconds
> > (the reason being Tim's long explanation :-).
>
> Note that if you combine several recently made proposals, you can have
> a fully backward-compatible solution for leap seconds.

I don't have much time to respond now, but please don't add leap seconds. People are actively working to get rid of them entirely and they are never useful on the business level which this module is aimed at.

-- 
Łukasz Rekucki

From ethan at stoneleaf.us  Sun Aug  2 20:52:10 2015
From: ethan at stoneleaf.us (Ethan Furman)
Date: Sun, 02 Aug 2015 11:52:10 -0700
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References: <55BA88F8.4080105@oddbird.net>
Message-ID: <55BE66DA.7010206@stoneleaf.us>

On 08/02/2015 10:32 AM, Łukasz Rekucki wrote:
> I don't have much time to respond now, but please don't add leap seconds. People
> are actively working to get rid of them entirely and they are never useful on the
> business level which this module is aimed at.

I don't know whether we should add support for leap-seconds to the new strict tzinfo, but I will mention that the proposal to get rid of them keeps being postponed, and leap seconds are actively harming the business community precisely because they are not being accounted for.

--
~Ethan~

From guido at python.org  Sun Aug  2 21:19:56 2015
From: guido at python.org (Guido van Rossum)
Date: Sun, 2 Aug 2015 21:19:56 +0200
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To: <55BE66DA.7010206@stoneleaf.us>
References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us>
Message-ID:

On Sun, Aug 2, 2015 at 8:52 PM, Ethan Furman wrote:
> On 08/02/2015 10:32 AM, Łukasz Rekucki wrote:
>
> I don't have much time to respond now, but please don't add leap seconds.
>> People
>> are actively working to get rid of them entirely and they are never
>> useful on the
>> business level which this module is aimed at.
>>
>
> I don't know whether we should add support for leap-seconds to the new
> strict tzinfo, but I will mention that the proposal to get rid of them
> keeps being postponed, and leap seconds are actively harming the business
> community precisely because they are not being accounted for.
>

I like the idea of using a special tzinfo to reveal the leap seconds for those who really want them. (And we won't have to provide such a tzinfo -- it's enough that one could be written, given a table of leap seconds.)

-- 
--Guido van Rossum (python.org/~guido)

From 4kir4.1i at gmail.com  Sun Aug  2 22:17:11 2015
From: 4kir4.1i at gmail.com (Akira Li)
Date: Sun, 02 Aug 2015 23:17:11 +0300
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To: ("Łukasz Rekucki"'s message of "Sun, 2 Aug 2015 19:32:58 +0200")
References: <55BA88F8.4080105@oddbird.net>
Message-ID: <878u9tqwc8.fsf@gmail.com>

Łukasz Rekucki writes:

> On Sunday, August 2, 2015, Alexander Belopolsky <
> alexander.belopolsky at gmail.com > wrote:
>
>> On Sun, Aug 2, 2015 at 9:46 AM, Guido van Rossum wrote:
>> > There's a simpler reason for ignoring leap seconds in datetime: Python's
>> > wall clock is mappable to POSIX timestamps, which also ignore leap seconds
>> > (the reason being Tim's long explanation :-).
>>
>> Note that if you combine several recently made proposals, you can have
>> a fully backward-compatible solution for leap seconds.
>
>
> I don't have much time to respond now, but please don't add leap seconds.
> People are actively working to get rid of them entirely and they are never
> useful on the business level which this module is aimed at.
>

"never useful on the business level" -- you can't guarantee that your input won't contain a leap second, e.g., 2012-06-30 23:59:60.209215; see http://stackoverflow.com/questions/21027639/python-datetime-not-accounting-for-leap-second-properly

Currently, datetime does not merely ignore leap seconds. It's hostile to them. I don't understand why I have to write:

    import time
    from calendar import timegm
    from datetime import datetime, timedelta

    time_string = '2012-06-30T23:59:60.209215'
    time_string, dot, us = time_string.partition('.')
    utc_time_tuple = time.strptime(time_string, "%Y-%m-%dT%H:%M:%S")
    dt = datetime(1970, 1, 1) + timedelta(seconds=timegm(utc_time_tuple))
    if dot:
        dt = dt.replace(microsecond=datetime.strptime(us, '%f').microsecond)
    print(dt)  # -> 2012-07-01 00:00:00.209215

Instead of a simple:

    from datetime import datetime

    time_string = '2012-06-30T23:59:60.209215'
    dt = datetime.strptime(time_string, "%Y-%m-%dT%H:%M:%S.%f")
    print(dt)  # -> 2012-07-01 00:00:00.209215

If datetime can't represent UTC time (a leap second) then it should behave like a broken-down POSIX time consistently: the same datetime object may refer to a different UTC time, i.e., datetime(2012, 7, 1) may refer to both 2012-07-01UTC and 2012-06-30 23:59:60UTC, like 1341100800 POSIX time does:

    2012-07-01UTC          -> 1341100800
    2012-06-30 23:59:60UTC -> 1341100800

and in reverse:

    1341100800 -> 2012-07-01UTC

Related: http://bugs.python.org/issue23574

From lrekucki at gmail.com  Sun Aug  2 22:26:16 2015
From: lrekucki at gmail.com (Łukasz Rekucki)
Date: Sun, 2 Aug 2015 22:26:16 +0200
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us>
Message-ID:

On 2 August 2015 at 21:19, Guido van Rossum wrote:
> On Sun, Aug 2, 2015 at 8:52 PM, Ethan Furman wrote:
>>
>> On 08/02/2015 10:32 AM, Łukasz Rekucki wrote:
>>
>>> I don't have much time to respond now, but please don't add leap seconds.
>>> People
>>> are actively working to get rid of them entirely and they are never
>>> useful on the
>>> business level which this module is aimed at.
>>
>>
>> I don't know whether we should add support for leap-seconds to the new
>> strict tzinfo, but I will mention that the proposal to get rid of them keeps
>> being postponed, and leap seconds are actively harming the business
>> community precisely because they are not being accounted for.
>
>
> I like the idea of using a special tzinfo to reveal the leap seconds for
> those who really want them. (And we won't have to provide such a tzinfo --
> it's enough that one could be written, given a table of leap seconds.)
>

But if we don't write one, how do we know it's possible to do? IMHO, the same approach was taken for tzinfo and DST and it didn't work out very well. For example, I probably could implement a timezone which represents TAI (assuming I always have an up-to-date leap seconds table), but it is not possible because tzinfo.utcoffset() requires me to return an integer number of minutes.

Most of the time, what you need to do with leap seconds is ignore them (as Tim described in his stock trading example). Adding support to a single programming language and fixing it in every application won't make the world better. Instead, you can use UTC-SLS [1] or do what Google did.

[1]: http://www.cl.cam.ac.uk/~mgk25/time/utc-sls/

-- 
Łukasz Rekucki

From ethan at stoneleaf.us  Sun Aug  2 22:34:11 2015
From: ethan at stoneleaf.us (Ethan Furman)
Date: Sun, 02 Aug 2015 13:34:11 -0700
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us>
Message-ID: <55BE7EC3.2020000@stoneleaf.us>

On 08/02/2015 01:26 PM, Łukasz Rekucki wrote:
> Most of the time, what you need to do with leap seconds is ignore them
> (as Tim described in his stock trading example). Adding support to a
> single programming language and fixing it in every application won't
> make the world better.
> Instead, you can use UTC-SLS or do what Google did. You mean to shut down the markets around the time of the leap second? Or risk other systems shut down for hours? That strikes me as a massive failure of software. -- ~Ethan~ From lrekucki at gmail.com Sun Aug 2 22:42:01 2015 From: lrekucki at gmail.com (=?UTF-8?Q?=C5=81ukasz_Rekucki?=) Date: Sun, 2 Aug 2015 22:42:01 +0200 Subject: [Datetime-SIG] Calendar vs timespan calculations... In-Reply-To: <55BE7EC3.2020000@stoneleaf.us> References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> <55BE7EC3.2020000@stoneleaf.us> Message-ID: On 2 August 2015 at 22:34, Ethan Furman wrote: > On 08/02/2015 01:26 PM, ?ukasz Rekucki wrote: > >> Most times you need leap seconds, is to ignore them (as Tim described >> in his trading stock example). Adding support to a single programming >> language and fixing it every application, won't make the world better. >> Instead, you can use UTC-SLS or do what Google did. > > > You mean to shut down the markets around the time of the leap second? Or > risk other systems shut down for hours? I didn't hear about *Google* shutting down for hours. I do believe the leap second problem can be solved on a OS-level instead of having to fix all applications in the world. -- ?ukasz Rekucki From alexander.belopolsky at gmail.com Sun Aug 2 22:53:13 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 2 Aug 2015 16:53:13 -0400 Subject: [Datetime-SIG] Calendar vs timespan calculations... In-Reply-To: References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> Message-ID: On Sun, Aug 2, 2015 at 4:26 PM, ?ukasz Rekucki wrote: > But if we don't write a one, how do we know it's possible to do? IMHO, > the same approach was taken for tzinfo and DST and it didn't work out > very well. This is unfair. Python included extensively tested tzinfo implementations in its documentation and the test suit from the very beginning. 
Third party libraries came up with their own more complete implementations and even addressed some of the limitations inherent in the original design. Overall, I don't think Python's history of timezone support is any worse that that of any other language of its age. From ethan at stoneleaf.us Sun Aug 2 23:21:01 2015 From: ethan at stoneleaf.us (Ethan Furman) Date: Sun, 02 Aug 2015 14:21:01 -0700 Subject: [Datetime-SIG] Calendar vs timespan calculations... In-Reply-To: References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> <55BE7EC3.2020000@stoneleaf.us> Message-ID: <55BE89BD.7010002@stoneleaf.us> On 08/02/2015 01:42 PM, ?ukasz Rekucki wrote: > I didn't hear about *Google* shutting down for hours. I do believe the > leap second problem can be solved on a OS-level instead of having to > fix all applications in the world. Fair point about Google. I'm happy with a strict tzinfo at least having the ability to deal with leap seconds -- then those that care can make it work. -- ~Ethan~ From chris.barker at noaa.gov Mon Aug 3 17:32:03 2015 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Mon, 3 Aug 2015 08:32:03 -0700 Subject: [Datetime-SIG] Calendar vs timespan calculations... In-Reply-To: <55BE89BD.7010002@stoneleaf.us> References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> <55BE7EC3.2020000@stoneleaf.us> <55BE89BD.7010002@stoneleaf.us> Message-ID: <6315761166972360255@unknownmsgid> > I'm happy with a strict tzinfo at least having the ability to deal with leap seconds -- then those that care can make it work. I'm a bit confused about what the strict tzinfo object has to do with leap seconds. Leap seconds are more like leap years than DST transitions. (Except that we don't know when they will occur in the future). An extra second (:60) is added, rather than repeating one. So it is never ambiguous what is meant. And all time zones handle them the same, yes? 
So consideration for leap seconds belongs in with the code that handles converting the Gregorian calendar representation to/from a timespan. The other consideration is that time parsing code can't barf on a 60th second, just like it can't barf on feb 29th. All this code is in datetime, isn't it? Please tell me there are no leap seconds in the middle of DST transitions! What am I missing here? By the way, I've generally been of the opinion that leap seconds didn't matter to me, or most people. But apparently there have been 26 leap seconds since 1972. That's actually quite a bit, even for people that aren't doing atomic physics.... -Chris > -- > ~Ethan~ > _______________________________________________ > Datetime-SIG mailing list > Datetime-SIG at python.org > https://mail.python.org/mailman/listinfo/datetime-sig > The PSF Code of Conduct applies to this mailing list: https://www.python.org/psf/codeofconduct/ From alexander.belopolsky at gmail.com Mon Aug 3 17:54:33 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 3 Aug 2015 11:54:33 -0400 Subject: [Datetime-SIG] Calendar vs timespan calculations... In-Reply-To: <6315761166972360255@unknownmsgid> References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> <55BE7EC3.2020000@stoneleaf.us> <55BE89BD.7010002@stoneleaf.us> <6315761166972360255@unknownmsgid> Message-ID: On Mon, Aug 3, 2015 at 11:32 AM, Chris Barker - NOAA Federal wrote: >> I'm happy with a strict tzinfo at least having the ability to deal with leap seconds -- then those that care can make it work. > > I'm a bit confused about what the strict tzinfo object has to do with > leap seconds. > > Leap seconds are more like leap years than DST transitions. (Except > that we don't know when they will occur in the future). An extra > second (:60) is added, rather than repeating one. So it is never > ambiguous what is meant. 
Yes, but if instead of having February 29, people decided that they would have two February 28s every four years, they would have the same calendar in all other days, but computers would have an issue similar to what we have at the end of DST. Even though the rest of the world spells the leap second as 23:59:60, we can spell it as 23:59:59(repeated) and represent as time(23, 59, 59, first=False). This is nothing but notation. (Note that there is no spelling for these times as POSIX "seconds since EPOCH", so if we try to map them to POSIX timestamps we will have to repeat a second.) A timezone that implements 23:59:59 + 1s = 23:59:60 and 23:59:60 + 1s = 00:00:00 can just as easily implement 23:59:59 + 1s = 23:59:59(repeated) and 23:59:59(repeated) + 1s = 00:00:00. The advantage of the flag over extending the range of minutes is that the added flag will be ignored by the older programs that will continue to function producing results that may be off by 1s, but will not crash with "ValueError: second must be in 0..59". From chris.barker at noaa.gov Mon Aug 3 18:49:14 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 3 Aug 2015 09:49:14 -0700 Subject: [Datetime-SIG] Calendar vs timespan calculations... In-Reply-To: References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> <55BE7EC3.2020000@stoneleaf.us> <55BE89BD.7010002@stoneleaf.us> <6315761166972360255@unknownmsgid> Message-ID: On Mon, Aug 3, 2015 at 8:54 AM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > Leap seconds are more like leap years than DST transitions. (Except > > that we don't know when they will occur in the future). An extra > > second (:60) is added, rather than repeating one. So it is never > > ambiguous what is meant. 
>
> Yes, but if instead of having February 29, people decided that they
> would have two February 28s every four years, they would have the same
> calendar in all other days, but computers would have an issue similar
> to what we have at the end of DST.

sure, but that's not what's done.

> Even though the rest of the world
> spells the leap second as 23:59:60, we can spell it as
> 23:59:59(repeated) and represent as time(23, 59, 59, first=False).
>

well, yes and no -- we can do whatever we want internally, but if we're going to print out the datetime in iso 8601 format, or whatever, shouldn't we do it the official UTC way??? And the 60 second thing is actually easier, not harder (Not ambiguous).

> This is nothing but notation. (Note that there is no spelling for
> these times as POSIX "seconds since EPOCH", so if we try to map them
> to POSIX timestamps we will have to repeat a second.)

well, there is the POSIX problem no matter how it's handled.

> A timezone that
> implements 23:59:59 + 1s = 23:59:60 and 23:59:60 + 1s = 00:00:00 can
> just as easily implement 23:59:59 + 1s = 23:59:59(repeated) and
> 23:59:59(repeated) + 1s = 00:00:00.
>

Maybe I'm getting hung up on purity over practicality -- but this sure seems like logic that should not be in the timezone object -- it has nothing to do with time zones. (I suppose the logic can go in the string_tzinfo base class, and it's easily shared, but it sure does feel like the wrong place to put it.)

And since I don't think anyone is actually proposing to implement this yet anyway, why not keep tzinfo clean -- and put leap seconds in datetime if and when it's decided that that's a good idea.

-CHB

> The advantage of the flag over extending the range of minutes is that
> the added flag will be ignored by the older programs that will
> continue to function producing results that may be off by 1s, but will
> not crash with "ValueError: second must be in 0..59".
>

--
Christopher Barker, Ph.D.
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Mon Aug 3 18:59:26 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 3 Aug 2015 11:59:26 -0500 Subject: [Datetime-SIG] Calendar vs timespan calculations... In-Reply-To: <6315761166972360255@unknownmsgid> References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> <55BE7EC3.2020000@stoneleaf.us> <55BE89BD.7010002@stoneleaf.us> <6315761166972360255@unknownmsgid> Message-ID: [Ethan] >> I'm happy with a strict tzinfo at least having the ability >> to deal with leap seconds -- then those that care can >> make it work. [Chris Barker] > I'm a bit confused about what the strict tzinfo object has to do with > leap seconds. Not much, except to the extent someone may want datetime arithmetic to reflect leap second reality. Using tzstrict is the suggested way to spell "I want timeline arithmetic, not classic arithmetic". Whether arithmetic accounts for leap seconds is the same issue, mutatis mutandis, as whether arithmetic accounts for DST transitions, and also whether arithmetic accounts for changes to a timezone's standard UTC offset. > Leap seconds are more like leap years than DST transitions. (Except > that we don't know when they will occur in the future). An extra > second (:60) is added, rather than repeating one. So it is never > ambiguous what is meant. This, alas, is off. First, "leap seconds" may also be removed, although that hasn't yet been done. The Earth's rotation doesn't always slow down - sometimes it speeds up. Nobody knows - and nobody _can_ know now - whether it will ever speed up enough long enough to trigger a leap second removal. 
Second, adding a leap second, adding an hour (at the end of DST), and adding a day in a leap year are all very much the same.; Adding an hour at the end of DST causes a "repeated hour" on any clock restricted to showing 24 hours per day. In exactly the same way, adding a leap second causes a "repeated second" on any clock restricted to showing 60 seconds per minute. And adding a leap day _would_ cause a "repeated day" on any calendar restricted to showing only 28 days in February. The only real difference is that _all_ calendars _do_ show 29 days in a leap year February, _no_ clocks show 25 hours in an end-of-DST day, and _almost no_ clocks show 61 seconds in a leap-second-added minute. > And all time zones handle them the same, yes? Eh - probably all the ones people have in mind. But, e.g., TAI ("atomic time") has no concept of leap seconds, DST transitions, changes of standard offset, or any other kind of adjustment. Whether someone wants to call that a "time zone", and go on try to write a tzinfo class to model it, is up to them. > So consideration for leap seconds belongs in with the code that > handles converting the Gregorian calendar representation to/from a > timespan. "Arithmetic", but all kinds of transitions are uniform in this respect - there's nothing special in this respect about leap seconds. > The other consideration is that time parsing code can't barf on a 60th > second, just like it can't barf on feb 29th. Sure it can. It's not even possible to know whether "second=60" makes any sense for infinitely ;-) many datetimes in the future. It's only theoretically possible to know for datetimes no later than about 6 months in the future (although that safe window on the future will relentlessly shrink over time). There's a delightful eternity of bikeshedding yet to be endured here ;-) > All this code is in datetime, isn't it? There are ways to parse strings all over the place, including in user-written code we know nothing about. 
OS services and the std C libraries have their own ideas about this stuff, and Python exposes some of what they supply too. > Please tell me there are no leap seconds in the middle of DST transitions! Leap-second adjustments occur at a fixed time in UTC, so each such adjustment occurs in every possible hour on local clocks. So it's theoretically possible to coincide with some location's DST rules. I don't know whether any such collisions have actually occurred. I doubt it. So far leap second transitions have occurred only at the end of the last days of a June or December, which I guess (but don't know) are far away from all DST transitions currently in use. But if this insane scheme isn't changed, we'll _eventually_ need to suffer at least one leap-second adjustment every day (well, "we" in the sense of humanity - you and I will be blissfully dead by then ;-) ). > .... > By the way, I've generally been of the opinion that leap seconds > didn't matter to me, or most people. But apparently there have been 26 > leap seconds since 1972. That's actually quite a bit, even for people > that aren't doing atomic physics.... It doesn't matter at all to most people or applications: most only care what the local clock says, and couldn't care less how many SI seconds separate "now" from, e.g., "same time next year". Given the history to date of leap second adjustments, the difference between "now" and "same time next year" differs by at most one second (comparing timeline/strict subtraction to classic/naive subtraction). Applications that need reliable, surprise-free accounting of durations over arbitrarily long spans should use TAI: that's what it was designed for. It's a relentlessly uniform running count of elapsed SI seconds. TAI also has a representation in an idealized Gregorian calendar, which is pretty much identical to datetime's notion of "naive time". 
UTC is currently _defined_ as an offset to TAI, but the offset changes over time in unpredictable ways (via "leap seconds"). Note that the definition of UTC changes over time too. Currently the TAI "second" and the UTC "second" are identical durations, but in earlier days the duration of a UTC second changed over time too, so that leap seconds _weren't_ needed to keep UTC time a good approximation to notions of solar time.

So it's flatly insane in a scientific context concerned with durations to use something other than TAI. What a UTC datetime "really means" has changed over time, and will almost certainly change again. Times recorded in TAI are intended to have the same meaning until the universe ends.

"Doctor! Doctor! It hurts when I do this!" ...

From alexander.belopolsky at gmail.com Mon Aug 3 19:34:36 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 3 Aug 2015 13:34:36 -0400
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> <55BE7EC3.2020000@stoneleaf.us> <55BE89BD.7010002@stoneleaf.us> <6315761166972360255@unknownmsgid>
Message-ID:

On Mon, Aug 3, 2015 at 12:49 PM, Chris Barker wrote:
>
>>
>> A timezone that
>> implements 23:59:59 + 1s = 23:59:60 and 23:59:60 + 1s = 00:00:00 can
>> just as easily implement 23:59:59 + 1s = 23:59:59(repeated) and
>> 23:59:59(repeated) + 1s = 00:00:00.
>
>
> Maybe I'm getting hung up on purity over practicality -- but this sure seems
> like logic that should not be in the timezone object -- it has nothing to
> do with time zones.

It has everything to do with timezones. The information about historical addition of leap seconds is and will be distributed with the timezone databases. This is what Olson's distribution has always done and what IANA intends to continue. See [1] for more details.

[1]: https://tools.ietf.org/html/draft-ietf-tzdist-service-11#section-5.6

From alexander.belopolsky at gmail.com Mon Aug 3 19:54:31 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 3 Aug 2015 13:54:31 -0400
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> <55BE7EC3.2020000@stoneleaf.us> <55BE89BD.7010002@stoneleaf.us> <6315761166972360255@unknownmsgid>
Message-ID:

On Mon, Aug 3, 2015 at 12:49 PM, Chris Barker wrote:
> if we're going to print out the datetime in iso 8601 format, or whatever,
> shouldn't we do it the official UTC way???

Maybe we should, but we don't. Try the following on a Linux system:

    $ TZ=right/UTC python
    >>> import datetime
    >>> datetime.datetime.utcfromtimestamp(1341100823).isoformat()
    '2012-06-30T23:59:59'
    >>> datetime.datetime.utcfromtimestamp(1341100824).isoformat()
    '2012-06-30T23:59:59'
    >>> datetime.datetime.utcfromtimestamp(1341100825).isoformat()
    '2012-07-01T00:00:00'

From tim.peters at gmail.com Mon Aug 3 20:00:51 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 3 Aug 2015 13:00:51 -0500
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> <55BE7EC3.2020000@stoneleaf.us> <55BE89BD.7010002@stoneleaf.us> <6315761166972360255@unknownmsgid>
Message-ID:

[Chris Barker]
>> Maybe I'm getting hung up on purity over practicality -- but this sure seems
>> like logic that should not be in the timezone object -- it has nothing to
>> do with time zones.

[Alexander Belopolsky]
> It has everything to do with timezones. The information about
> historical addition of leap seconds is and will be distributed with
> the timezone databases. This is what Olson's distribution has always
> done and what IANA intends to continue. See [1] for more details.
> > [1]: https://tools.ietf.org/html/draft-ietf-tzdist-service-11#section-5.6 A "timezone" _now_ should really be defined as an offset from TAI, the one remaining scheme that remains - and always will remain - utterly uniform. Then it would make perfect sense to everyone that the history of UTC leap seconds, which are changes to UTC's offset from TAI, belong in timezone logic. Then again, everything looks simple to me - LOL ;-) From alexander.belopolsky at gmail.com Mon Aug 3 20:07:51 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 3 Aug 2015 14:07:51 -0400 Subject: [Datetime-SIG] Calendar vs timespan calculations... In-Reply-To: References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> <55BE7EC3.2020000@stoneleaf.us> <55BE89BD.7010002@stoneleaf.us> <6315761166972360255@unknownmsgid> Message-ID: On Mon, Aug 3, 2015 at 2:00 PM, Tim Peters wrote: > A "timezone" _now_ should really be defined as an offset from TAI Unfortunately, they will first have to rename it to ITA as a compromise between French TAI and English IAT and since there are three more permutations of these three letters, this problem will stay forever in the subcommittees. From alexander.belopolsky at gmail.com Mon Aug 3 20:29:26 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 3 Aug 2015 14:29:26 -0400 Subject: [Datetime-SIG] Calendar vs timespan calculations... In-Reply-To: <6315761166972360255@unknownmsgid> References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> <55BE7EC3.2020000@stoneleaf.us> <55BE89BD.7010002@stoneleaf.us> <6315761166972360255@unknownmsgid> Message-ID: On Mon, Aug 3, 2015 at 11:32 AM, Chris Barker - NOAA Federal wrote: > I'm a bit confused about what the strict tzinfo object has to do with > leap seconds. Here is another way to explain it. Suppose I want to know how many seconds passed from 2011-07-15T12:00 to (and not including) 2012-07-15T12:00. 
Well, simple: it's one year, which is 365 days which is 365*24 hours, which is ...

    >>> 365*24*60*60
    31536000

Simple, but wrong - we forgot that 2012 is a leap year. Try again

    >>> from datetime import *
    >>> SECOND = timedelta(seconds=1)
    >>> (datetime(2012, 7, 15, 12) - datetime(2011, 7, 15, 12)) / SECOND
    31622400.0

This is better, but what if we are in a location that stopped observing DST in 2012? This is what we want tzstrict to account for, but if we just account for DST, we will still be wrong because there was a leap second added at the end of June 2012. Luckily the same mechanism can be used to account for that.

Note that in answering this question, we did not have to know what to call the extra day in February, the extra hour in March or April or the extra second in June. All we needed was a mapping from the start and end point to some scale that is considered linear (UTC for most practical purposes or TAI for those who care.)

From tim.peters at gmail.com Mon Aug 3 22:25:49 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 3 Aug 2015 15:25:49 -0500
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To: <55BE66DA.7010206@stoneleaf.us>
References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us>
Message-ID:

[Ethan Furman]
> I don't know whether we should add support for leap-seconds to the new
> strict tzinfo, but I will mention that the proposal to get rid of them keeps
> being postponed,

But that won't last forever. If nothing changes, the frequency of leap second adjustments will eventually become unbearable, falling from today's "about once per 18 months" through "about once per year" through "about once per month" through "about once per day" ... and even now, the longer change is delayed, the more countries have decided to support the proposal(s) to abolish leap seconds. That multiple US government agencies support the change now is a Big Deal. Russia continuing to oppose it is also a Big Deal.
Nothing a nuclear war couldn't sort out ;-) > and leap seconds are actively harming the business community precisely because > they are not being accounted for. Eh. The reason Google's "smear" works so well is precisely because it hides the existence of leap seconds entirely from billions of lines of code: - No code ever sees a repeated second. - No code ever sees a missing second. - No code ever sees a second outside of [0, 1, 2, ..., 59]. In return, code _does_ sometimes see that one second isn't the same duration as the next second, but the difference is small enough from second to second that almost no code cares one whit about that. Code that does care is highly exceptional, has always had to worry about tons of obscure stuff, and is maintained by domain experts paid well to deal with it all. Alas, to be so effective, that has to be implemented inside the OS, so that almost all ways of asking "what time is it now?" are equally oblivious to leap seconds. From guido at python.org Mon Aug 3 22:47:19 2015 From: guido at python.org (Guido van Rossum) Date: Mon, 3 Aug 2015 22:47:19 +0200 Subject: [Datetime-SIG] Calendar vs timespan calculations... In-Reply-To: References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> Message-ID: Is there a way to end the discussion of leap seconds by fiat? Or maybe someone can create leap-seconds-sig where those so inclined can continue the discussion? The handling of leap seconds by the stdlib isn't going to change in the forseeable future (let's say until 2038 :-). -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Mon Aug 3 22:50:07 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 3 Aug 2015 15:50:07 -0500 Subject: [Datetime-SIG] Calendar vs timespan calculations... In-Reply-To: References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> Message-ID: [Guido] >> ... 
>> I like the idea of using a special tzinfo to reveal the leap seconds for
>> those who really want them. (And we won't have to provide such a tzinfo --
>> it's enough that one could be written, given a table of leap seconds.)

[Łukasz Rekucki]
> But if we don't write a one, how do we know it's possible to do?

Because it's a shallow problem: not difficult, just tedious. We _are_ proposing that tzstrict implement timeline arithmetic across DST transitions. Timeline arithmetic will work correctly for that if and only if it will also work correctly for leap seconds. They're exactly the same problem, it's just that one works naturally in units of minutes while the other works in units of seconds. None of the code cares about that, since - whatever it _looks_ like from the outside - all arithmetic in datetime is "really" working with microseconds.

> IMHO, the same approach was taken for tzinfo and DST and it didn't
> work out very well.

Very different. It was known from the start - and documented from the start - that datetime supplied no way to distinguish between the ambiguous times at the end of DST. It wasn't anticipated that some people would care so much about this non-problem ;-) that they'd make heroic efforts to supply hacks to work around it. That the hacks _are_ so obviously hacks isn't their fault, because they're trying to accomplish something the design never intended to support. Alexander's "first" flag supplies the one bit of support that was missing from the start, and it's equally applicable to ambiguities due to leap seconds as to those due to DST transitions.

> For example, I probably could implement a timezone which represents
> TAI (assuming I have always up-to-date leap seconds table), but it is
> not possible because tzinfo.utcoffset() requires me to return an
> integer number of minutes.

Yup. There's nothing in the code that really gives a rip about what utcoffset() returns - any number of microseconds would work as well.
Restricting it to a multiple of minutes with magnitude strictly less than 60*24 was simply intended to be an aid in catching programming errors. Relaxing that restriction is backward-compatible (except for code that's deliberately trying to provoke this exception). > Most times you need leap seconds, is to ignore them (as Tim described > in his trading stock example). Adding support to a single programming > language and fixing it every application, won't make the world better. > Instead, you can use UTC-SLS or do what Google did. > > [1]: http://www.cl.cam.ac.uk/~mgk25/time/utc-sls/ The people who clamor for leap seconds are of two kinds: 1. People who want "purity" because that's what they always want, and regardless of whether it makes any practical difference to them. 2. People who legitimately care about exactly how many SI seconds separate two datetimes. They ought to consider using TAI instead (where this is trivial to determine via code that's direct and obviously correct), but in principle it's possible to get that too from UTC datetimes.(although, if this were my concern, I'd always worry about whether a transformation so complex was correctly implemented by the library I was using! The more critical the application, the greater the value of "obviously correct" over "not obviously broken"). From tim.peters at gmail.com Mon Aug 3 22:51:57 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 3 Aug 2015 15:51:57 -0500 Subject: [Datetime-SIG] Calendar vs timespan calculations... In-Reply-To: References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> Message-ID: [Guido] > Is there a way to end the discussion of leap seconds by fiat? Or maybe > someone can create leap-seconds-sig where those so inclined can continue the > discussion? The handling of leap seconds by the stdlib isn't going to change > in the forseeable future (let's say until 2038 :-). +1. 
That means it conveniently ends here with my last message on the topic, overlapping yours by seconds. I said everything there that needs to be said ;-)

From lrekucki at gmail.com Mon Aug 3 23:50:29 2015
From: lrekucki at gmail.com (Łukasz Rekucki)
Date: Mon, 3 Aug 2015 23:50:29 +0200
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us>
Message-ID:

On Monday, August 3, 2015, Tim Peters wrote:
> [Guido]
> > Is there a way to end the discussion of leap seconds by fiat? Or maybe
> > someone can create leap-seconds-sig where those so inclined can continue the
> > discussion? The handling of leap seconds by the stdlib isn't going to change
> > in the foreseeable future (let's say until 2038 :-).

Thank you :)

> +1. That means it conveniently ends here with my last message on the
> topic, overlapping yours by seconds. I said everything there that
> needs to be said ;-)

--
Łukasz Rekucki

From alexander.belopolsky at gmail.com Tue Aug 4 00:15:10 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 3 Aug 2015 18:15:10 -0400
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us>
Message-ID:

On Mon, Aug 3, 2015 at 4:47 PM, Guido van Rossum wrote:
> The handling of leap seconds by the stdlib isn't going to change in the
> foreseeable future (let's say until 2038 :-).

Until 2038-01-19T03:14:08 UTC+00:00 - 26 leap seconds to be precise.
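
The 2038 date quoted above is where a signed 32-bit count of seconds since the POSIX epoch runs out. A quick check of that arithmetic, as a sketch that ignores leap seconds the same way POSIX time does (the variable names are only illustrative):

```python
from datetime import datetime, timedelta

# First second count since 1970-01-01 that no longer fits in a signed
# 32-bit integer (the largest representable value is 2**31 - 1).
overflow = 2**31

# Classic, leap-second-free arithmetic, which is what POSIX time assumes.
rollover = datetime(1970, 1, 1) + timedelta(seconds=overflow)
print(rollover.isoformat())  # 2038-01-19T03:14:08
```
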

From tim.peters at gmail.com Tue Aug 4 06:10:39 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 3 Aug 2015 23:10:39 -0500
Subject: [Datetime-SIG] Calendar vs timespan calculations...
In-Reply-To:
References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us>
Message-ID:

Sorry about this. I don't want to talk about leap seconds anymore, but I left a bad impression that intersects with what we _are_ doing.

[Guido]
>>> ...
>>> I like the idea of using a special tzinfo to reveal the leap seconds for
>>> those who really want them. (And we won't have to provide such a tzinfo --
>>> it's enough that one could be written, given a table of leap seconds.)

[Łukasz Rekucki]
>> But if we don't write a one, how do we know it's possible to do?

[Tim]
> Because it's a shallow problem: not difficult, just tedious.

I should expound on _how_ tedious ;-) The basic template for doing timeline arithmetic is:

1. Convert the datetime(s) to UTC.

2. Do straightforward classic arithmetic in UTC.

3. Convert back to original timezone (except for datetime-datetime subtraction, in which case we're done).

This works because, in step #2, classic arithmetic is the same as timeline arithmetic. However, that relies on the fact that Python's current implementation of UTC ignores leap seconds. The approximated UTC is an instance of "naive time".

But in a world with UTC leap seconds, that's no longer true. In that world, UTC doesn't match naive time, and classic arithmetic can't work. More is needed then, to move to another kind of time bijectively related to UTC that _does_ match naive time (so that classic arithmetic is correct). Since real-life UTC is defined as an offset (well, a collection of pattern-less offsets) from TAI, and TAI is an instance of naive time, that's the obvious choice. So add:

1.5. Use the frickin' leap second table to convert to TAI.

and, in 2, s/UTC/TAI/, and

2.5. Use the frickin' leap second table to convert back to UTC.
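
Steps 1.5 and 2 can be sketched for the subtraction case roughly as follows. This is an illustration under stated assumptions, not anything proposed in the thread: the names `LEAP_TABLE`, `utc_to_tai` and `elapsed_si_seconds` are hypothetical, the table is deliberately partial (only the two TAI-UTC entries needed here), and the inputs are naive datetimes already converted to UTC (step 1) that do not fall on a leap second itself.

```python
from datetime import datetime, timedelta

# Hypothetical, partial leap second table for illustration only:
# (first UTC instant at which the offset applies, TAI - UTC in seconds).
LEAP_TABLE = [
    (datetime(2009, 1, 1), 34),
    (datetime(2012, 7, 1), 35),
]

def utc_to_tai(utc):
    """Step 1.5: shift a naive UTC datetime onto the TAI scale."""
    offset = None
    for start, seconds in LEAP_TABLE:
        if utc >= start:
            offset = seconds
    if offset is None:
        raise ValueError("datetime precedes the partial table")
    return utc + timedelta(seconds=offset)

def elapsed_si_seconds(start_utc, end_utc):
    """Step 2 with s/UTC/TAI/: classic subtraction on the TAI scale."""
    return (utc_to_tai(end_utc) - utc_to_tai(start_utc)).total_seconds()

start = datetime(2011, 7, 15, 12)
end = datetime(2012, 7, 15, 12)

classic = (end - start).total_seconds()   # ignores the June 2012 leap second
timeline = elapsed_si_seconds(start, end)  # counts it
print(classic, timeline)  # 31622400.0 31622401.0
```

For datetime-datetime subtraction no step 2.5 is needed, since the result is a duration; for datetime plus timedelta the same table would have to be consulted again to map the TAI result back to UTC.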
From tim.peters at gmail.com Tue Aug 4 08:14:12 2015 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 4 Aug 2015 01:14:12 -0500 Subject: [Datetime-SIG] Local time disambiguation proposal In-Reply-To: References: Message-ID: [Alexander Belopolsky] > ... > I am also working on converting my write-up [1] for publication as a > PEP. Please note that I have made substantial changes since my > initial post, so if you comment on the proposal please make sure that > you've read the latest version. > > [1]: https://github.com/abalkin/ltdf/blob/master/README.rst Since that appears to be obsolete now, here are some comments on: https://github.com/abalkin/ltdf/blob/master/pep-0495.txt instead. """ The ``replace()`` methods of the ``datetime.time`` and ``datetime.datetime`` classes will get a new keyword-only argument called ``first`` with the default value ``True`` """ That's dubious: .replace() treats all other missing arguments as meaning "copy this attribute's current value". I don't see why `.first` should be different. If there's a strong reason (I don't see one) for dt2 = dt1.replace(minutes=2) to force dt2.first to True when dt1.first is False, it would be better to raise an exception than change the value silently. """ The ``timestamp()`` method of ``datetime.datetime`` will return value advanced by 3600 if ``self`` represents an ambiguous hour and ``first`` is False. """ Are all known DST adjustments, and changes to standard UTC offsets -- in, say, zoneinfo -- exactly one hour? For example, I read this on the web, so it must be true: "Lord Howe Island (Australia) advances its clocks by half an hour in the summer" ;-) Since I expect we expect to support all the goofy timezones in zoneinfo, best to get that right from the start. """ The value of "first" will be ignored in all operations except those that involve conversion between timezones. """ "Involve" is vague. 
Subtraction and comparison can "involve" conversion to UTC, at least conceptually, when two datetimes don't share a tzinfo member. It's explicitly stated later that: """ Comparison ---------- Instances of ``datetime.time`` and ``datetime.datetime`` classes that differ only by the value of their ``first`` attribute will compare as equal. """ but it's not explicitly stated that subtraction of such instances will return timedelta(0). If there's a reason to explicitly point out one, then both should be mentioned, since comparison actually inherits its behavior from subtraction. The real reason the cases named will always compare as equal is that they share a common tzinfo member (or both have none). That short-circuits the conceptional conversion to UTC before comparison. But if you plug a workalike tzinfo member into one of them (same timezone represented by a distinct tzinfo object), then the results of comparison (and subtraction) _may_ change "just because" .first differs between them. That all depends on what .utcoffset() returns. In the text about comparisons, I expect you're just trying to say that .first isn't used as a tie breaker, as if datetime comparisons were akin to tuple comparisons, comparing field by field. But that's not how it's done, and there's no real point to explaining how a model that didn't apply to begin with would be affected if it had applied to begin with ;-) So I'd drop the text about comparisons. But I'd add some text explaining that, while this PEP isn't aiming at timeline arithmetic, some cases of subtraction and comparison, which have used timeline arithmetic all along, will return different results now due to .utcoffset() returning different results now. 
To make the reader happy, you could even mention that these different results are also correct now ;-) From ischwabacher at wisc.edu Tue Aug 4 17:47:58 2015 From: ischwabacher at wisc.edu (ISAAC J SCHWABACHER) Date: Tue, 04 Aug 2015 15:47:58 +0000 Subject: [Datetime-SIG] Local time disambiguation proposal In-Reply-To: References: Message-ID: [Alexander Belopolsky] > > ... > > I am also working on converting my write-up [1] for publication as a > > PEP. Please note that I have made substantial changes since my > > initial post, so if you comment on the proposal please make sure that > > you've read the latest version. > > > > [1]: https://github.com/abalkin/ltdf/blob/master/README.rst [Tim Peters] > Since that appears to be obsolete now, here are some comments on: > > https://github.com/abalkin/ltdf/blob/master/pep-0495.txt > > instead. > > """ > The ``replace()`` methods of the ``datetime.time`` and > ``datetime.datetime`` classes will get a new keyword-only argument > called ``first`` with the default value ``True`` > """ > > That's dubious: .replace() treats all other missing arguments as > meaning "copy this attribute's current value". I don't see why > `.first` should be different. If there's a strong reason (I don't see > one) for > > dt2 = dt1.replace(minutes=2) > > to force dt2.first to True when dt1.first is False, it would be better > to raise an exception than change the value silently. [ijs] Not so long ago I think I finally got a point that Tim has been dancing around throughout this whole discussion, without quite saying it outright. (Another possibility would be Reading Comprehension Fail on my part...) Classic arithmetic and `replace()` are low level operations on datetimes, whereas duration and period arithmetic and time zone conversion are high level ones. The original design of datetime was to expose all of the low level details so that people could implement their own high level stuff, because we're programmers and that's what we do, ya know? 
All of this is to say that I agree with Tim that `first=True` is not a good default for `replace()`. While that method verifies that all of its numerical arguments are within their valid ranges, it doesn't verify that the resulting time exists in its time zone, so for consistency I would expect that it simply keep the value of the `first` flag without validation or modification if that argument is not provided. But this raises the question of how an unambiguous datetime with first=False should be handled by other code. My preference is that all high level operations should treat such datetimes as having first=True. Also, given this divide, it would be good to document in the datetime module which methods are high level and which are low. > """ > The ``timestamp()`` method of ``datetime.datetime`` will return value > advanced by 3600 if ``self`` represents an ambiguous hour and > ``first`` is False. > """ > > Are all known DST adjustments, and changes to standard UTC offsets -- > in, say, zoneinfo -- exactly one hour? For example, I read this on > the web, so it must be true: "Lord Howe Island (Australia) advances > its clocks by half an hour in the summer" ;-) Since I expect we > expect to support all the goofy timezones in zoneinfo, best to get > that right from the start. Also relevant is the fact that this rationale fails in such cases: """ We chose the minute byte to store the the "first" bit because this choice preserves the natural ordering. """ > """ > The value of "first" will be ignored in all operations except those > that involve conversion between timezones. > """ > > "Involve" is vague. Subtraction and comparison can "involve" > conversion to UTC, at least conceptually, when two datetimes don't > share a tzinfo member. It's explicitly stated later that: > > """ > Comparison > ---------- > Instances of ``datetime.time`` and ``datetime.datetime`` classes that > differ only by the value of their ``first`` attribute will compare as > equal. 
> """ > > but it's not explicitly stated that subtraction of such instances will > return timedelta(0). If there's a reason to explicitly point out one, > then both should be mentioned, since comparison actually inherits its > behavior from subtraction. > > The real reason the cases named will always compare as equal is that > they share a common tzinfo member (or both have none). That > short-circuits the conceptional conversion to UTC before comparison. > But if you plug a workalike tzinfo member into one of them (same > timezone represented by a distinct tzinfo object), then the results of > comparison (and subtraction) _may_ change "just because" .first > differs between them. That all depends on what .utcoffset() returns. > > In the text about comparisons, I expect you're just trying to say that > .first isn't used as a tie breaker, as if datetime comparisons were > akin to tuple comparisons, comparing field by field. But that's not > how it's done, and there's no real point to explaining how a model > that didn't apply to begin with would be affected if it had applied to > begin with ;-) > > So I'd drop the text about comparisons. But I'd add some text > explaining that, while this PEP isn't aiming at timeline arithmetic, > some cases of subtraction and comparison, which have used timeline > arithmetic all along, will return different results now due to > .utcoffset() returning different results now. To make the reader > happy, you could even mention that these different results are also > correct now ;-) I agree with dropping the part about comparisons, since that will no longer be true for datetimes with tzstrict time zones if that part of the discussion comes to fruition. 
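[editor's sketch] The quoted point about a shared tzinfo member short-circuiting the conversion to UTC can be seen with today's datetime. The sketch below is illustrative only: MorningShift is a made-up tzinfo whose offset depends on the hour, chosen so that the two code paths give visibly different answers.

```python
from datetime import datetime, timedelta, tzinfo


class MorningShift(tzinfo):
    """Hypothetical zone: UTC+1 before noon local time, UTC+0 from noon on."""

    def utcoffset(self, dt):
        return timedelta(hours=1) if dt.hour < 12 else timedelta(0)

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return "MorningShift"


tz_a = MorningShift()
tz_b = MorningShift()  # a "workalike": same rules, distinct object

early = datetime(2015, 8, 4, 11, 0, tzinfo=tz_a)
late_same = datetime(2015, 8, 4, 13, 0, tzinfo=tz_a)
late_other = late_same.replace(tzinfo=tz_b)

# Same tzinfo object on both sides: utcoffset() is not consulted,
# so this is plain classic arithmetic on the naive fields.
same_tz_delta = late_same - early

# Different tzinfo objects: both operands are converted to UTC first,
# so whatever utcoffset() returns now affects the result.
mixed_tz_delta = late_other - early
```

Here same_tz_delta is two hours and mixed_tz_delta is three, even though the two tzinfo objects describe the same rules.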
ijs From alexander.belopolsky at gmail.com Tue Aug 4 18:06:53 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 4 Aug 2015 12:06:53 -0400 Subject: [Datetime-SIG] Local time disambiguation proposal In-Reply-To: References: Message-ID: On Tue, Aug 4, 2015 at 2:14 AM, Tim Peters wrote: > [Alexander Belopolsky] >> ... make sure that you've read the latest version. >> >> [1]: https://github.com/abalkin/ltdf/blob/master/README.rst > > Since that appears to be obsolete now, here are some comments on: > > https://github.com/abalkin/ltdf/blob/master/pep-0495.txt > > instead. I made README.rst a symbolic link to pep-0495.txt, so the PEP will get rendered on the main project page: https://github.com/abalkin/ltdf Unfortunately I cannot make the old link work as before. > > """ > The ``replace()`` methods of the ``datetime.time`` and > ``datetime.datetime`` classes will get a new keyword-only argument > called ``first`` with the default value ``True`` > """ > > That's dubious: .replace() treats all other missing arguments as > meaning "copy this attribute's current value". I don't see why > `.first` should be different. Right. I updated the text: https://github.com/abalkin/ltdf/commit/2fa9e6bfeefec61bf642bec658149ba55a348982 > """ > The ``timestamp()`` method of ``datetime.datetime`` will return value > advanced by 3600 if ``self`` represents an ambiguous hour and > ``first`` is False. > """ > > Are all known DST adjustments, and changes to standard UTC offsets -- > in, say, zoneinfo -- exactly one hour? For example, I read this on > the web, so it must be true: "Lord Howe Island (Australia) advances > its clocks by half an hour in the summer" ;-) Since I expect we > expect to support all the goofy timezones in zoneinfo, best to get > that right from the start. I agree. 
In addition to your example, I found some cases where standard time
was rolled back:

Date        Time      Zone               Offset_before  Offset_after
1942-01-28  16:30:00  Asia/Pontianak               9.0           7.5
1942-02-15  16:30:00  Asia/Kuala_Lumpur            9.0           7.5
1942-02-15  16:30:00  Asia/Singapore               9.0           7.5
1942-03-22  16:30:00  Asia/Jakarta                 9.0           7.5
1942-04-30  17:30:00  Asia/Rangoon                 9.0           6.5
1942-08-31  18:30:00  Asia/Dacca                   6.5           6.0
1942-08-31  18:30:00  Asia/Dhaka                   6.5           6.0
1944-08-31  15:00:00  Asia/Jayapura                9.5           9.0
1945-09-07  15:00:00  Asia/Seoul                   9.0           8.5
1945-10-14  17:30:00  Asia/Karachi                 5.5           5.0
1948-04-30  16:30:00  Asia/Jakarta                 8.0           7.5
1948-04-30  16:30:00  Asia/Pontianak               8.0           7.5
1950-04-30  16:00:00  Asia/Jakarta                 7.5           7.0
1965-01-01  04:30:00  America/Caracas             -4.0          -4.5
1965-10-31  04:30:00  America/Goose_Bay           -3.5          -4.0
1978-10-20  19:00:00  Asia/Tehran                  4.0           3.5
1996-05-24  18:30:00  Asia/Colombo                 6.5           6.0
1996-10-25  18:00:00  Asia/Colombo                 6.0           5.5

I think, for practical purposes we will allow implementations without
direct access to the timezone data to assume that "folds" can only
happen on a 15 min boundary and be a multiple of 15 min in duration,
but I rewrote that section now.

https://github.com/abalkin/ltdf/commit/aa63f7d313ad245c7ca1e8713b07a8799112a6f5

I agree with your remaining points and will update the PEP accordingly.

>
> """
> The value of "first" will be ignored in all operations except those
> that involve conversion between timezones.
> """
>
> "Involve" is vague. Subtraction and comparison can "involve"
> conversion to UTC, at least conceptually, when two datetimes don't
> share a tzinfo member. It's explicitly stated later that:
>
> """
> Comparison
> ----------
> Instances of ``datetime.time`` and ``datetime.datetime`` classes that
> differ only by the value of their ``first`` attribute will compare as
> equal.
> """
>
> but it's not explicitly stated that subtraction of such instances will
> return timedelta(0).
If there's a reason to explicitly point out one, > then both should be mentioned, since comparison actually inherits its > behavior from subtraction. > > The real reason the cases named will always compare as equal is that > they share a common tzinfo member (or both have none). That > short-circuits the conceptional conversion to UTC before comparison. > But if you plug a workalike tzinfo member into one of them (same > timezone represented by a distinct tzinfo object), then the results of > comparison (and subtraction) _may_ change "just because" .first > differs between them. That all depends on what .utcoffset() returns. > > In the text about comparisons, I expect you're just trying to say that > .first isn't used as a tie breaker, as if datetime comparisons were > akin to tuple comparisons, comparing field by field. But that's not > how it's done, and there's no real point to explaining how a model > that didn't apply to begin with would be affected if it had applied to > begin with ;-) > > So I'd drop the text about comparisons. But I'd add some text > explaining that, while this PEP isn't aiming at timeline arithmetic, > some cases of subtraction and comparison, which have used timeline > arithmetic all along, will return different results now due to > .utcoffset() returning different results now. To make the reader > happy, you could even mention that these different results are also > correct now ;-) From chris.barker at noaa.gov Tue Aug 4 18:13:58 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Tue, 4 Aug 2015 09:13:58 -0700 Subject: [Datetime-SIG] Fwd: Calendar vs timespan calculations... In-Reply-To: <7803688847893344661@unknownmsgid> References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> <7803688847893344661@unknownmsgid> Message-ID: > On Aug 3, 2015, at 9:11 PM, Tim Peters wrote: > > Sorry about this. I don't want to talk about leap seconds anymore, Nor do I, but TL;DR -- leap seconds are in the definition of UTC. 
And there are use cases (see below) that python may want to support at some point. So it would be good to have infrastructure in place so that someone could implement it in the future (probably as a third party lib). Now, if you actually want to read more about this: > The basic template for doing timeline arithmetic is: > > 1. Convert the datetime(s) to UTC. > 2. Do straightforward classic arithmetic in UTC. > 3. Convert back to original timezone (except for datetime-datetime > subtraction, in which case we're done). > > This works because, in step #2, classic arithmetic is the same as > timeline arithmetic. However, that relies on that Python's current > implementation of UTC ignores leap seconds. The approximated UTC is an > instance of "naive time". If I have this right, the datetime object encodes time using the proleptic Gregorian calendar, with no leap seconds -- yes? Is that what TAI time is? From: http://www.timeanddate.com/time/international-atomic-time.html It appears so. > TAI > is an instance of naive time, that's the obvious choice. So add: > > 1.5. Use the frickin' leap second table to convert to TAI. > > and, in 2, s/UTC/TAI/, and > > 2.5 Use the frickin' leap second table to convert back to UTC. Exactly. So for now, we simply document that Naive time IS TAI time, and that the to-from UTC methods on the tzinfo objects are actually to-from TAI ( renaming them is probably out of the question, though) Then if anyone ever wants to implement leap seconds (I.e. a proper UTC), then they create a UTC tzinfo object that converts properly to-from TAI with that frikin' leap second table. ( and all the other time zones to match...). I still think that the base datetime object should be UTC, which would include leap seconds, but it's too late for that. (hmm -- could a user simply plug in a new datetime object, instead of a new tzinfo objects? but anyway, as long as the door is open one way or another, then we're good for now. 
Use case: In the netcdf CF metadata standard: http://cfconventions.org Datetimes are encoded as "some time span since an epoch", i.e. "seconds since 1970-01-01T0:0:0". The epoch can be anything, and the units can be any well-defined reasonable unit -- seconds, hours, days. Then elsewhere, you can specify the Calendar used. And this gets ugly because there are some weird ones -- for instance, some climate models use a 360 day year! There was recently a big huge discussion about how to specify the variations of the Gregorian calendar - whether leap seconds are in play or not, etc. This is uglier than it should be because a lot (most?) time libraries don't do leap seconds, but many people _think_ they are using UTC, when they are not. And for the most part, it's close enough. When using this encoding, it would be smart to use an epoch close to the time of your data / model results -- for instance, the zeroth time step of your model. And some people do this. In that case, you're likely to only cross maybe one or two leap-seconds. But in practice, a lot of folks encode their time using an epoch farther away from their data -- sometimes Unix epoch (that makes sense), and sometimes ones that seem totally arbitrary to me (maybe their time library uses it?). In this case, when the times are encoded back to calendar representation, they are often off by 26 seconds if a time lib that doesn't support leap-seconds is used. In practice, even 26 seconds probably doesn't matter in most cases, but when you are working with datetimes (like Python's) with a resolution of microseconds, errors that large are perceived to be significant, and can lead to actual errors (like getting time steps from different sources out of order), if the code is not written carefully to account for the limited precision of the inputs. OK -- I'm done with leap seconds. -Chris -- Christopher Barker, Ph.D. 
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA 98115        (206) 526-6317   main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From alexander.belopolsky at gmail.com Tue Aug 4 18:29:24 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 4 Aug 2015 12:29:24 -0400
Subject: [Datetime-SIG] Fwd: Calendar vs timespan calculations...
In-Reply-To:
References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> <7803688847893344661@unknownmsgid>
Message-ID:

On Tue, Aug 4, 2015 at 12:13 PM, Chris Barker wrote:
>> On Aug 3, 2015, at 9:11 PM, Tim Peters wrote:
>>
>> Sorry about this. I don't want to talk about leap seconds anymore,
>
> Nor do I, but
>
> TL;DR -- leap seconds are in the definition of UTC.

I have not reflected this in the PEP, but in the reference
implementation, I am toying with the following bit of "leap seconds
support":

diff --git a/Lib/datetime.py b/Lib/datetime.py
..
-        ss = min(ss, 59)    # clamp out leap seconds if the platform has them
-        return cls(y, m, d, hh, mm, ss, us)
+        ss, first = (ss, True) if ss < 60 else (59, False)
+        return cls(y, m, d, hh, mm, ss, us, first=first)
..

See .

Note that even though many people claim that POSIX does not support
leap seconds, this is only true with respect to the time_t timestamps
and does not apply to the broken down tm structure which is more
directly related to Python datetime than "seconds since EPOCH." To
the contrary, leap seconds support is mandated by POSIX in the case of
struct tm:

"""
The header shall declare the structure tm, which shall include at
least the following members:

int tm_sec Seconds [0,60].
...
The range [0,60] for tm_sec allows for the occasional leap second.
"""

See .
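[editor's sketch] The clamping rule in the diff above can be tried out in isolation. This is only an illustration: the helper name from_broken_down is made up, and because the stdlib datetime constructor has no `first` parameter, the flag is returned next to the datetime instead of being stored on it.

```python
from datetime import datetime


def from_broken_down(y, m, d, hh, mm, ss, us=0):
    """Build a datetime from struct-tm-like fields where ss may be 60.

    Returns (datetime, first). A leap second (ss == 60) is clamped to
    second 59 and reported with first=False, mirroring the diff above.
    """
    if not 0 <= ss <= 60:
        raise ValueError("second must be in 0..60")
    ss, first = (ss, True) if ss < 60 else (59, False)
    return datetime(y, m, d, hh, mm, ss, us), first
```

Under this rule 23:59:59 and 23:59:60 produce the same datetime fields and differ only in the flag, which is the same shape as an ambiguous local time.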
From alexander.belopolsky at gmail.com Tue Aug 4 20:33:37 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 4 Aug 2015 14:33:37 -0400
Subject: [Datetime-SIG] Fwd: Calendar vs timespan calculations...
Message-ID:

[Tim Peters]
>> TAI
>> is an instance of naive time, that's the obvious choice. So add:
>>
>> 1.5. Use the frickin' leap second table to convert to TAI.
>>
>> and, in 2, s/UTC/TAI/, and
>>
>> 2.5 Use the frickin' leap second table to convert back to UTC.

[Chris Barker]
>
> Exactly. So for now, we simply document that Naive time IS TAI time,
> and that the to-from UTC methods on the tzinfo objects are actually
> to-from TAI ( renaming them is probably out of the question, though)

This is not the POV that I've been advocating. Since the subject of
this thread is "Calendar vs timespan calculations", let me discuss a
use case that is hopefully more familiar: compute the number of days
from 1900-02-01 to 1900-03-01:

>>> date(1900,3,1) - date(1900,2,1)
datetime.timedelta(28)

Python's answer (28) is correct in most of the Western world, but in
Greece, there were 29 days from 1900-02-01 to 1900-03-01 because they
did have 1900-02-29. (Greece did not switch to the Gregorian calendar
until 1923.)

Note that "1900-02-01" and "1900-03-01" are just names for two
historical days which had different meanings in different parts of the
world. The digits that appear in those names do not necessarily mean
anything arithmetically. The first day of February does not have to
be in a constant relationship to the first day of March any more than
the first day of Hanukkah to the first day of Ramadan.

While it is convenient to have a universal bijection between days and
the (additive group of) integers, it is not a prerequisite for being
able to compute the number of days between two dates.
For example, the hypothetical

>>> julian_days("1900-02-01", "1900-03-01")
29
>>> gregorian_days("1900-02-01", "1900-03-01")
28

can be implemented without any conversion to the common scale. (For
example, we can have a large list of julian days from "1000-01-01" to
"9000-01-01" and another large list of gregorian days from
"0006-06-06" to "9999-09-09" and implement our functions using a
binary search into these lists. Not an efficient algorithm, but
universal enough to cover calendars that use names of the living
Emperors instead of years.)

My main point is that as long as we can spell two dates in a way that
is understood in some part of the world, we can have a software module
that can tell the number of days (or seconds) between these two dates
as long as it has accurate enough information about timekeeping
practices in that location. This software module may or may not
operate by converting to any well-known integer scale.

My specific proposal can be summarized in the following pseudocode:

    class datetime:
        ...
        def __add__(self, other):
            try:
                add = self.tzinfo.add
            except AttributeError:
                ...  # old logic
            else:
                return add(self, other)

        def __sub__(self, other):
            try:
                sub = self.tzinfo.sub
            except AttributeError:
                ...  # old logic
            else:
                return sub(self, other)

If we do that, we can implement a HistoricalGreek timezone, so that

>>> datetime(1900,3,1,tzinfo=HistoricalGreek) - datetime(1900,2,1,tzinfo=HistoricalGreek)
datetime.timedelta(29)

but we will face the problem that
datetime(1900,2,29,tzinfo=HistoricalGreek) will still raise
"ValueError: ('day must be in 1..28', 29)".
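[editor's sketch] The delegation idea can be tried today with a datetime subclass rather than a change to datetime itself. Everything below is an illustrative assumption, not the proposal's implementation: CalendarDatetime is a made-up subclass, and this toy HistoricalGreek models only one thing, an extra day sitting between the labels 1900-02-28 and 1900-03-01.

```python
from datetime import datetime, timedelta, tzinfo


class HistoricalGreek(tzinfo):
    """Toy zone: knows about a single extra day before 1900-03-01."""

    _AFTER_EXTRA_DAY = datetime(1900, 3, 1)

    def utcoffset(self, dt):
        return timedelta(0)

    def dst(self, dt):
        return None

    def tzname(self, dt):
        return "HistoricalGreek"

    def sub(self, left, right):
        # Classic difference first, bypassing the subclass override.
        delta = datetime.__sub__(left, right)
        lo, hi = sorted((left.replace(tzinfo=None), right.replace(tzinfo=None)))
        # Add the extra day when the span crosses it, keeping the sign.
        if lo < self._AFTER_EXTRA_DAY <= hi:
            extra = timedelta(days=1)
            delta += extra if left > right else -extra
        return delta


class CalendarDatetime(datetime):
    """datetime that defers datetime-datetime subtraction to tzinfo.sub."""

    def __sub__(self, other):
        sub = getattr(self.tzinfo, "sub", None)
        if sub is None or not isinstance(other, datetime):
            return super().__sub__(other)  # old logic
        return sub(self, other)


greek = HistoricalGreek()
march_1 = CalendarDatetime(1900, 3, 1, tzinfo=greek)
february_1 = CalendarDatetime(1900, 2, 1, tzinfo=greek)
greek_days = march_1 - february_1
plain_days = CalendarDatetime(1900, 3, 1) - CalendarDatetime(1900, 2, 1)
```

Here greek_days is 29 days and plain_days is 28. The sketch does nothing about the constructor rejecting day 29 in February 1900, which is the problem described just above.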
This problem can be solved by implementing HistoricalGreek.add so that

>>> datetime(1900,2,28,tzinfo=HistoricalGreek) + timedelta(1)
datetime.datetime(1900,2,29,tzinfo=HistoricalGreek,first=False)

Now, if we also want "next day after 1900-02-28 in Greece" to print
nicely as "1900-02-29", we should also arrange that isoformat() is
also delegated to HistoricalGreek:

    def isoformat(self, sep='T'):
        try:
            fmt = self.tzinfo.isoformat
        except AttributeError:
            ...  # old logic
        else:
            return fmt(self, sep)

and HistoricalGreek.isoformat will then know that what comes as the
"repeated February 28" is the "February 29" in disguise.

The only tricky issue here is the mixed timezone arithmetics. My
solution is to disallow subtraction of datetime instances that both
have tzinfo implement the "sub" method, but not the same:

    def __sub__(self, other):
        try:
            sub = self.tzinfo.sub
            try:
                other_sub = other.tzinfo.sub
            except AttributeError:
                other_sub = sub
        except AttributeError:
            ...  # old logic
        else:
            if sub.__func__ is not other_sub.__func__:
                raise ValueError("Incompatible calendars")
            return sub(self, other)

From alexander.belopolsky at gmail.com Tue Aug 4 21:15:02 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 4 Aug 2015 15:15:02 -0400
Subject: [Datetime-SIG] Local time disambiguation proposal
In-Reply-To:
References:
Message-ID:

On Tue, Aug 4, 2015 at 2:14 AM, Tim Peters wrote:
> """
> The value of "first" will be ignored in all operations except those
> that involve conversion between timezones.
> """
>
> "Involve" is vague. Subtraction and comparison can "involve"
> conversion to UTC, at least conceptually, when two datetimes don't
> share a tzinfo member.

I tried to address this in .

From tim.peters at gmail.com Wed Aug 5 04:26:05 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 4 Aug 2015 21:26:05 -0500
Subject: [Datetime-SIG] Fwd: Calendar vs timespan calculations...
In-Reply-To: References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> <7803688847893344661@unknownmsgid> Message-ID: [Tim] >> Sorry about this. I don't want to talk about leap seconds anymore, [Chris Barker] > Nor do I, but ... Python isn't going to support leap seconds out of the box. We (primarily meaning Alexander) _are_ trying to make it possible for someone who wants to supply an external implementation to do so, but that's not going to drive any current decision. > ... > Exactly. So for now, we simply document that Naive time IS TAI time, But it's not. You're free to think of a naive datetime as representing a TAI time if you like, but that's as far it goes. You are, e.g., equally free to think of it as being a GPS time. That's up to you (TAI and GPS time both fit Python's naive time model - but so does "Tim's personal time", which has nothing to do with durations in SI seconds). For the rest, this is the high-order bit: > ... > This is uglier than it should be because a lot (most?) time libraries don't > do leap seconds, Can you name one that does? Not a single one has been mentioned in this entire mountain of messages so far (that I saw). Even Common LISP, which can usually be counted on to be as anal as conceivable ;-) , ignores leap seconds. Which is a primary reason Python will always ignore them too: it makes Python's core calculations compatible with every other programming language's on Earth. If you need surprise-free durations in SI seconds, use TAI. If your colleagues create a Tower of Babel, each using their own definition of time, that's far more a social problem than a technical one. Agree on a standard, and - poof! - no more communication or conversion problems in this area - and if you pick TAI, the intended meaning of the data will remain clear until the last human dies ;-) There are ways to get at "real UTC" (and TAI, and UT1, and ...) from Python now, but they're in specialized packages few people know about. 
That's appropriate, because few people actually need them. They would confuse people who don't need them. For example, just try to picture 99.997% of users trying to make sense of the astropy docs: http://astropy.readthedocs.org/en/stable/time/index.html The astropy.time package provides functionality for manipulating times and dates. Specific emphasis is placed on supporting time scales (e.g. UTC, TAI, UT1, TDB) and time representations (e.g. JD, MJD, ISO 8601) that are used in astronomy and required to calculate, e.g., sidereal times and barycentric corrections. It's wonderful that such packages exist, but they don't belong in the core. From tim.peters at gmail.com Wed Aug 5 05:43:20 2015 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 4 Aug 2015 22:43:20 -0500 Subject: [Datetime-SIG] Local time disambiguation proposal In-Reply-To: References: Message-ID: [ijs] > Not so long ago I think I finally got a point that Tim has > been dancing around throughout this whole discussion, > without quite saying it outright. (Another possibility > would be Reading Comprehension Fail on my part...) I have to suggest another: that datetime is so far from what you would have designed that you just can't believe Guido didn't have your design in mind all along and was just too harried to implement it properly. See? I don't dance at all ;-) For example, I'm pretty sure you would have designed datetime to store a duration from an epoch. But that was never on the table. It was an explicit requirement to maintain year, month, day, hour ... attributes separately, in both the storage ("pickle") and in-memory formats. That's because conversions between that and a duration is expensive, and many use cases required quick access to the attributes. For example, business-oriented web apps typically do little (if any) time arithmetic, but are forever reading up datetimes from a database and needing to display them in human-readable form (show the attribute values). 
Sometimes with a time zone indicator attached. And sometimes converted to the viewer's timezone.

> Classic arithmetic and `replace()` are low level operations
> on datetimes,

I can speak about .replace() definitively since I "invented" it: it's just intended to be shorthand for calling a constructor in cases where "most of" the fields retain the same values. It was born of necessity, because I found an early datetime prototype unbearably tedious to use just for writing unit tests. Because datetime objects are immutable, "changing a single attribute" is horridly verbose without .replace().

> whereas duration

The only intended support for durations was in "naive time". As Guido said earlier, he expected that people who needed more than that would continue using timestamps. It's not the first time his expectations were dashed.

> and period arithmetic

In naive time, period arithmetic with timedelta works fine for units <= weeks. There _were_ use cases for many more kinds of period arithmetic, although they were still all in naive time. But specifying and implementing all of that too is a major project of its own, and we ran out of time. For example, look at all the text it takes to explain RRULE in the iCalendar spec:

http://www.ietf.org/rfc/rfc2445.txt

Things like "the first Tuesday after a Monday in November, every 4 years" (which describes US presidential election dates) are just the start. Speaking of which, I think it would be insane to try to express such complexities by overloading binary arithmetic operators. Unless someone used that every day, they'd soon forget what all the magic meant: "write-only" code. So we punted on that, hoping someone else would take up the challenge. And, e.g., I believe "dateutil" does implement the whole RRULE spec.

> and time zone conversion are high level ones.
They're a world of pain unto themselves ;-) And another case where there wasn't enough time to do a full-blown job, so only an abstract tzinfo base class was released at first.

> The original design of datetime was to expose all of the low level
> details

It was primarily to implement "naive time" because that alone sufficed to meet the vast majority of the requirements.

> so that people could implement their own high level stuff, because
> we're programmers and that's what we do, ya know?

We certainly did hope people could build on it, but this wasn't so much driven by philosophy as by that our employer was getting visibly (& understandably!) annoyed with continuing to pay for datetime development after it met every major use case identified in the requirements phase. We had to cut it off.

> All of this is to say that I agree with Tim that `first=True` is not a
> good default for `replace()`. While that method verifies that all of its
> numerical arguments are within their valid ranges, it doesn't verify
> that the resulting time exists in its time zone, so for consistency I
> would expect that it simply keep the value of the `first` flag without
> validation or modification if that argument is not provided.

I believe Alexander already agreed with this.

> But this raises the question of how an unambiguous datetime with
> first=False should be handled by other code. My preference is that
> all high level operations should treat such datetimes as having
> first=True.

I agree this needs to be cleared up. The _actual_ "rules" now appear to be:

1. first==False means this is the later of two ambiguous times.

2. first==True means anything else (it's the earlier of two ambiguous times, or it's not an ambiguous time at all).

3. And, by the way, #1 was fibbing: first=False may also mean it's not an ambiguous time.

However, so far #3 may only be in internal uses (like the .fromutc() implementation fiddling the flag to trick .utcoffset() into telling it something useful).
I don't think it needs to be decided just yet. As things get implemented, the delights and drawbacks will get clearer. > Also, given this divide, it would be good to document in the datetime > module which methods are high level and which are low. Well, since that distinction doesn't exist in my head, you'll have to write the doc patch ;-) ... >> Are all known DST adjustments, and changes to standard UTC offsets -- >> in, say, zoneinfo -- exactly one hour? For example, I read this on >> the web, so it must be true: "Lord Howe Island (Australia) advances >> its clocks by half an hour in the summer" ;-) Since I expect we >> expect to support all the goofy timezones in zoneinfo, best to get >> that right from the start. > Also relevant is the fact that this rationale fails in such cases: > > """ > We chose the minute byte to store the "first" bit because this choice preserves the natural ordering. > """ Sorry, I didn't grasp your meaning there. But I didn't grasp Alexander's intent either ;-) From "later ambiguous time" alone we have no idea _how_ much later, so the bit-fiddling comment there didn't make sense to me. Perhaps that was what you were getting at? The C-level datetime comparison code is much lower-level than in datetime.py, and the former actually compares raw bytestrings: if (GET_DT_TZINFO(self) == GET_DT_TZINFO(other)) { diff = memcmp(((PyDateTime_DateTime *)self)->data, ((PyDateTime_DateTime *)other)->data, _PyDateTime_DATETIME_DATASIZE); Stuffing a flag into any of the existing bytes is bound to break some case there. I'd add another byte to the pickle, but then I'm no longer paid to obsess over bytes ;-) > ... > I agree with dropping the part about comparisons, since that will > no longer be true for datetimes with tzstrict time zones if that part > of the discussion comes to fruition. Yup! It's a big step in the right direction if we can just get ambiguous local->utc conversions working correctly first. 
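The concern about byte-wise comparison raised in the message above can be illustrated with a small sketch. Everything in it is an assumption made for illustration: pack_naive() is a hypothetical helper, the byte layout is only loosely modeled on a big-endian year followed by one byte per field, and putting the flag in the top bit of the minute byte is just one possible reading of the quoted rationale. It is not the actual C implementation.

```python
import struct


def pack_naive(year, month, day, hour, minute, second, later=False):
    """Pack fields into bytes (hypothetical layout, for illustration only).

    The `later` flag is stuffed into the top bit of the minute byte.
    """
    minute_byte = minute | (0x80 if later else 0)
    return struct.pack(">HBBBBB", year, month, day, hour, minute_byte, second)


earlier_0130 = pack_naive(2015, 11, 1, 1, 30, 0)
later_0130 = pack_naive(2015, 11, 1, 1, 30, 0, later=True)
plain_0145 = pack_naive(2015, 11, 1, 1, 45, 0)

# Without the flag, comparing the raw bytes agrees with comparing the fields.
print(earlier_0130 < plain_0145)

# With the flag set, identical wall-clock fields no longer compare equal
# byte-for-byte, so a memcmp-style comparison sees the flag.
print(earlier_0130 == later_0130)

# The flagged 01:30 also sorts after the unflagged 01:45 in the same hour.
print(later_0130 > plain_0145)
```

Whether the last two results count as breakage depends on what same-zone comparison is supposed to mean, which is exactly the point under discussion; the sketch only shows that a raw byte comparison cannot ignore a flag stored in one of the compared bytes.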
From chris.barker at noaa.gov Wed Aug 5 16:02:12 2015 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 5 Aug 2015 07:02:12 -0700 Subject: [Datetime-SIG] Fwd: Calendar vs timespan calculations... In-Reply-To: References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> <7803688847893344661@unknownmsgid> Message-ID: <1744706367558826763@unknownmsgid> > Python isn't going to support leap seconds out of the box. Got it -- no problem there. > We > (primarily meaning Alexander) _are_ trying to make it possible for > someone who wants to supply an external implementation to do so, That's all I'm sayin' And thank you Alexander for putting so much effort into this. -Chris From ischwabacher at wisc.edu Wed Aug 5 17:31:39 2015 From: ischwabacher at wisc.edu (ISAAC J SCHWABACHER) Date: Wed, 05 Aug 2015 15:31:39 +0000 Subject: [Datetime-SIG] Local time disambiguation proposal In-Reply-To: References: Message-ID: Sounds like you understand everything I'm saying, and I understand some of what you're saying. :) Top-posted from Microsoft Outlook Web App; may its designers be consigned for eternity to that circle of hell in which their dog food is consumed. ________________________________________ From: Tim Peters Sent: Tuesday, August 4, 2015 22:43 To: ISAAC J SCHWABACHER Cc: Alexander Belopolsky; datetime-sig Subject: Re: [Datetime-SIG] Local time disambiguation proposal [ijs] > Not so long ago I think I finally got a point that Tim has > been dancing around throughout this whole discussion, > without quite saying it outright. (Another possibility > would be Reading Comprehension Fail on my part...) I have to suggest another: that datetime is so far from what you would have designed that you just can't believe Guido didn't have your design in mind all along and was just too harried to implement it properly. See? I don't dance at all ;-) For example, I'm pretty sure you would have designed datetime to store a duration from an epoch. 
But that was never on the table. It was an explicit requirement to maintain year, month, day, hour ... attributes separately, in both the storage ("pickle") and in-memory formats. That's because conversion between that and a duration is expensive, and many use cases required quick access to the attributes. For example, business-oriented web apps typically do little (if any) time arithmetic, but are forever reading up datetimes from a database and needing to display them in human-readable form (show the attribute values). Sometimes with a time zone indicator attached. And sometimes converted to the viewer's timezone. > Classic arithmetic and `replace()` are low level operations > on datetimes, I can speak about .replace() definitively since I "invented" it: it's just intended to be shorthand for calling a constructor in cases where "most of" the fields retain the same values. It was born of necessity, because I found an early datetime prototype unbearably tedious to use just for writing unit tests. Because datetime objects are immutable, "changing a single attribute" is horridly verbose without .replace(). > whereas duration The only intended support for durations was in "naive time". As Guido said earlier, he expected that people who needed more than that would continue using timestamps. It's not the first time his expectations were dashed. > and period arithmetic In naive time, period arithmetic with timedelta works fine for units <= weeks. There _were_ use cases for many more kinds of period arithmetic, although they were still all in naive time. But specifying and implementing all of that too is a major project of its own, and we ran out of time. For example, look at all the text it takes to explain RRULE in the iCalendar spec: http://www.ietf.org/rfc/rfc2445.txt Things like "the first Tuesday after a Monday in November, every 4 years" (which describes US presidential election dates) are just the start. 
Speaking of which, I think it would be insane to try to express such complexities by overloading binary arithmetic operators. Unless someone used that every day, they'd soon forget what all the magic meant: "write-only" code. So we punted on that, hoping someone else would take up the challenge. And, e.g., I believe "dateutil" does implement the whole RRULE spec. > and time zone conversion are high level ones. They're a world of pain unto themselves ;-) And another case where there wasn't enough time to do a full-blown job, so only an abstract tzinfo base class was released at first. > The original design of datetime was to expose all of the low level > details It was primarily to implement "naive time" because that alone sufficed to meet the vast majority of the requirements. > so that people could implement their own high level stuff, because > we're programmers and that's what we do, ya know? We certainly did hope people could build on it, but this wasn't so much driven by philosophy as by that our employer was getting visibly (& understandably!) annoyed with continuing to pay for datetime development after it met every major use case identified in the requirements phase. We had to cut it off. > All of this is to say that I agree with Tim that `first=True` is not a > good default for `replace()`. While that method verifies that all of its > numerical arguments are within their valid ranges, it doesn't verify > that the resulting time exists in its time zone, so for consistency I > would expect that it simply keep the value of the `first` flag without > validation or modification if that argument is not provided. I believe Alexander already agreed with this. > But this raises the question of how an unambiguous datetime with > first=False should be handled by other code. My preference is that > all high level operations should treat such datetimes as having > first=True. I agree this needs to be cleared up. The _actual_ "rules" now appear to be: 1. 
first==False means this is the later of two ambiguous times. 2. first==True means anything else (it's the earlier of two ambiguous times, or it's not an ambiguous time at all). 3. And, by the way, #1 was fibbing: first=False may also mean it's not an ambiguous time. However, so far #3 may only be in internal uses (like the .fromutc() implementation fiddling the flag to trick .utcoffset() into telling it something useful). I don't think it needs to be decided just yet. As things get implemented, the delights and drawbacks will get clearer. > Also, given this divide, it would be good to document in the datetime > module which methods are high level and which are low. Well, since that distinction doesn't exist in my head, you'll have to write the doc patch ;-) ... >> Are all known DST adjustments, and changes to standard UTC offsets -- >> in, say, zoneinfo -- exactly one hour? For example, I read this on >> the web, so it must be true: "Lord Howe Island (Australia) advances >> its clocks by half an hour in the summer" ;-) Since I expect we >> expect to support all the goofy timezones in zoneinfo, best to get >> that right from the start. > Also relevant is the fact that this rationale fails in such cases: > > """ > We chose the minute byte to store the "first" bit because this choice preserves the natural ordering. > """ Sorry, I didn't grasp your meaning there. But I didn't grasp Alexander's intent either ;-) From "later ambiguous time" alone we have no idea _how_ much later, so the bit-fiddling comment there didn't make sense to me. Perhaps that was what you were getting at? The C-level datetime comparison code is much lower-level than in datetime.py, and the former actually compares raw bytestrings: if (GET_DT_TZINFO(self) == GET_DT_TZINFO(other)) { diff = memcmp(((PyDateTime_DateTime *)self)->data, ((PyDateTime_DateTime *)other)->data, _PyDateTime_DATETIME_DATASIZE); Stuffing a flag into any of the existing bytes is bound to break some case there. 
I'd add another byte to the pickle, but then I'm no longer paid to obsess over bytes ;-) > ... > I agree with dropping the part about comparisons, since that will > no longer be true for datetimes with tzstrict time zones if that part > of the discussion comes to fruition. Yup! It's a big step in the right direction if we can just get ambiguous local->utc conversions working correctly first. From chris.barker at noaa.gov Wed Aug 5 16:31:25 2015 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 5 Aug 2015 07:31:25 -0700 Subject: [Datetime-SIG] Fwd: Calendar vs timespan calculations... In-Reply-To: References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> <7803688847893344661@unknownmsgid> Message-ID: <2497717646148121916@unknownmsgid> >> Exactly. So for now, we simply document that Naive time IS TAI time, > > But it's not. You're free to think of a naive datetime as > representing a TAI time if you like, but that's as far as it goes. You > are, e.g., equally free to think of it as being a GPS time. That's up > to you (TAI and GPS time both fit Python's naive time model - but so > does "Tim's personal time", which has nothing to do with durations in > SI seconds). But naive time is not some arbitrary system unto itself. And while it is naive about time zones, and has no DST transitions, it is very much an implementation of a particular calendar system. And when you add a tzinfo object with methods like utcoffset(), then you are very strongly implying that it is now UTC. Oh, and datetime.datetime.utcnow() kind of implies that, too. The docs do clearly state "no leap seconds" in numerous places. But nudges like me bring it up enough that maybe a bit more in the docs would be helpful. I'll contemplate a doc patch. -Chris > > For the rest, this is the high-order bit: > >> ... >> This is uglier than it should be because a lot (most?) time libraries don't >> do leap seconds, > > Can you name one that does? 
Not a single one has been mentioned in > this entire mountain of messages so far (that I saw). Even Common > LISP, which can usually be counted on to be as anal as conceivable ;-) > , ignores leap seconds. > > Which is a primary reason Python will always ignore them too: it > makes Python's core calculations compatible with every other > programming language's on Earth. > > If you need surprise-free durations in SI seconds, use TAI. If your > colleagues create a Tower of Babel, each using their own definition of > time, that's far more a social problem than a technical one. Agree on > a standard, and - poof! - no more communication or conversion problems > in this area - and if you pick TAI, the intended meaning of the data > will remain clear until the last human dies ;-) > > There are ways to get at "real UTC" (and TAI, and UT1, and ...) from > Python now, but they're in specialized packages few people know about. > That's appropriate, because few people actually need them. They would > confuse people who don't need them. For example, just try to picture > 99.997% of users trying to make sense of the astropy docs: > > http://astropy.readthedocs.org/en/stable/time/index.html > > The astropy.time package provides functionality for manipulating > times and dates. Specific emphasis is placed on supporting > time scales (e.g. UTC, TAI, UT1, TDB) and time > representations (e.g. JD, MJD, ISO 8601) that are used in > astronomy and required to calculate, e.g., sidereal times and > barycentric corrections. > > It's wonderful that such packages exist, but they don't belong in the core. From tim.peters at gmail.com Wed Aug 5 17:43:03 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 5 Aug 2015 10:43:03 -0500 Subject: [Datetime-SIG] Fwd: Calendar vs timespan calculations... 
In-Reply-To: <2497717646148121916@unknownmsgid> References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> <7803688847893344661@unknownmsgid> <2497717646148121916@unknownmsgid> Message-ID: [Chris Barker] > But naive time is not some arbitrary system unto itself. And while it > is naive about time zones, and has no DST transitions, it is very much > an implementation of a particular calendar system. > > And when you add a tzinfo object with methods like utcoffset(), then > you are very strongly implying that it is now UTC. Oh, and > datetime.datetime.utcnow() kind of implies that, too. Much the same can be said about every other programming language on Earth with a standard implementation of dates and times that includes some notion of time zones. They all ignore leap seconds, and all implement the same approximation to real-life UTC as Python (although not all support the same range of years, not all accept Gregorian dates before the date that calendar system was first adopted, some support Julian dates, ...), and all call it "UTC" anyway. > The docs do clearly state "no leap seconds" in numerous places. But > nudges like me bring it up enough that maybe a bit more in the docs > would be helpful. > > I'll contemplate a doc patch. I'd say the _body_ of the docs are quite involved enough already. Adding more warnings about things few people actually care about would be a disservice to most users. Here's an analogy. That Python supports the literal 0.1 strongly implies that _means_ "one tenth" exactly. But it doesn't. We see "bug reports" related to that many times every year. It's the F-est of FAQs. But it's again something that's not news to anyone with any experience. "Floating point" is all most programmers need to hear about that, just like "ignores leap seconds" is all most programmers need to hear about Python's treatment of UTC. "Been there, done that, same as everyone else" in both cases. 
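The 0.1 analogy is easy to check directly. The following is a minimal sketch using only the standard library; the variable names are illustrative and not taken from the thread.

```python
from decimal import Decimal
from fractions import Fraction

# The exact value of the binary double that the literal 0.1 denotes.
stored = Fraction(0.1)
tenth = Fraction(1, 10)

# Decimal(0.1) prints every digit of the stored value, not "0.1".
print(Decimal(0.1))

# The stored value is close to one tenth, but not equal to it.
print(stored == tenth)

# The classic consequence that shows up in "bug reports".
print(0.1 + 0.2 == 0.3)
```

Both comparisons come out false, which is the floating-point counterpart of "ignores leap seconds": a documented approximation that only surprises people who expected exactness.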
The docs would be ill-served too by droning on about the differences between real numbers and binary floating-point. Most programmers already have some understanding of that, while others need a _lot_ of words to disabuse them of even their shallowest illusions ;-) Instead I wrote an appendix on the topic for the Tutorial, which may well be the most frequently referenced piece of the docs in replies to bug reports: https://docs.python.org/3/tutorial/floatingpoint.html That's worked well. A wordy intro is there to point to when someone really needs it, but the language and library docs aren't cluttered with it. A similar approach may be appropriate for going on at length about the subtleties of Python's UTC vs real-life UTC vs POSIX vs ... Most users truly don't care, but it "would be nice" to have a full account for those who do. It's hard to provide guidance on what such a thing should cover, because bug reports related to it are historically rare. Here's the most directly relevant one I could find, from about 5 years ago: http://bugs.python.org/issue4775 Perhaps ironically, that wasn't filed by someone with little knowledge of time schemes, but by someone with "too much" knowledge ;-) So that's a challenge: a writeup has to be easy to understand for rank newbies yet use technical terms so precisely that bona fide experts are satisfied too. It's also hard to know where to stop. If you want to explain _everything_ about "real life" UTC, then you have to explain all this too (pasted from Wikipedia's article on "Unix time"): """ The present form of UTC, with leap seconds, is defined only from 1 January 1972 onwards. Prior to that, since 1 January 1961 there was an older form of UTC in which not only were there occasional time steps, which were by non-integer numbers of seconds, but also the UTC second was slightly longer than the SI second, and periodically changed to continuously approximate the Earth's rotation. 
Prior to 1961 there was no UTC, and prior to 1958 there was no widespread atomic timekeeping; in these eras, some approximation of GMT (based directly on the Earth's rotation) was used instead of an atomic timescale. """ Who cares about any of that? Hardly anyone - but some people do. From mal at egenix.com Wed Aug 5 18:26:02 2015 From: mal at egenix.com (M.-A. Lemburg) Date: Wed, 05 Aug 2015 18:26:02 +0200 Subject: [Datetime-SIG] Fwd: Calendar vs timespan calculations... In-Reply-To: References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> <7803688847893344661@unknownmsgid> <2497717646148121916@unknownmsgid> Message-ID: <55C2391A.5000405@egenix.com> I think you are mixing up a few things here, or I'm just misunderstanding you, which is just as possible (date/time is full of wonders, so no surprise there :-)) If you have a local date/time value which references a day and a time on that day, you can convert this into UTC without caring about leap seconds at all. The reason is that the conversion is done to, again, a date and a time on that day. The problems with leap seconds only matter when you care about time differences and then only if you need the elapsed time calculated in seconds when spanning multiple days. Most business applications only care about time differences calculated in number of days, or seconds between two times on a single day. In finance, you often even ignore complete days for the sake of simplicity (years having 360 days, all months having 30 days, etc. - there's a whole bunch of different systems to choose from). The few times where you really want to know the number of elapsed SI seconds between two points in time, you will most likely not use date/time objects for the calculation anyway, but instead revert to floats or integers counting nanoseconds. 
For those cases, I think it's perfectly fine to have a package on PyPI which deals with the conversion from the point in time to the value you are dealing with in your calculations and back again, so complicating the stdlib for this doesn't appear to me to be a good idea. BTW: In mxDateTime I have long resisted adding TZ code. The only TZ support currently in the system is for doing the one time conversion to UTC and back to local time again for display to the user when requested. IMHO, times should always be stored in "UTC", since that's the only half sane time standard that's understood by enough people to get decent interoperable work done (+/- 36 seconds that is, depending on who you talk to ;-)). Regarding terms: I started with "GMT" in mxDateTime, then added "UTC" as alias, and may well add "TAI" at some point as another alias. For most people, these are all the same, anyway :-) Purists will loudly disagree, of course, but I have the Zen of Python to my rescue: practicality beats purity. Cheers, -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 05 2015) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> mxODBC Plone/Zope Database Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From tim.peters at gmail.com Wed Aug 5 22:01:28 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 5 Aug 2015 15:01:28 -0500 Subject: [Datetime-SIG] Fwd: Calendar vs timespan calculations... 
In-Reply-To: <55C2391A.5000405@egenix.com> References: <55BA88F8.4080105@oddbird.net> <55BE66DA.7010206@stoneleaf.us> <7803688847893344661@unknownmsgid> <2497717646148121916@unknownmsgid> <55C2391A.5000405@egenix.com> Message-ID: [M.-A. Lemburg ] > I think you are mixing up a few things here, or I'm just > misunderstanding you, which is just as possible (date/time is > full of wonders, so no surprise there :-)) > > If you have a local date/time value which references a day and a time > on that day, you can convert this into UTC without caring about > leap seconds at all. The reason is that the conversion is done > to, again, a date and a time on that day. I think almost everyone here does understand all this, although I don't believe your last point was explicitly pointed out before. It's worth pointing out! A time entered as YYMMDD HHMMSS can be viewed as specifying an exact real-life UTC time, and so can (within the limitations of the system clock) asking "what time is it now?". For most people, that's enough. Leap seconds do cause the TAI and UTC Gregorian calendars to drift more & more out of synch over time, but, as you say, those so inclined can view each new day as starting exactly in synch with real-life UTC again. What they can't do with the builtin stuff is get exact duration in SI seconds between two datetimes viewed as representing real-life UTC. Which Chris wants to do. Endlessly ;-) > ... > Regarding terms: I started with "GMT" in mxDateTime, then > added "UTC" as alias, and may well add "TAI" at some point as > another alias. For most people, these are all the same, anyway :-) Don't add TAI as an alias. Only a relative handful of people have ever heard of it. Among those who both have and care about it, it has a precisely defined meaning which isn't satisfied at all unless they _can_ get elapsed SI seconds by subtracting. That's a primary use case for TAI. 
The primary use case for UTC is to approximate UT1 ("solar time") in a way that doesn't require changing what "a second" means, as a duration, at every instant. > Purists will loudly disagree, of course, but I have the Zen > of Python to my rescue: practicality beats purity. So long as you add "Refuse the temptation to add TAI as an alias for UTC" ;-) From mal at egenix.com Thu Aug 6 01:18:38 2015 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 06 Aug 2015 01:18:38 +0200 Subject: [Datetime-SIG] Fwd: Calendar vs timespan calculations... In-Reply-To: References: <55BE66DA.7010206@stoneleaf.us> <7803688847893344661@unknownmsgid> <2497717646148121916@unknownmsgid> <55C2391A.5000405@egenix.com> Message-ID: <55C299CE.9060301@egenix.com> On 05.08.2015 22:01, Tim Peters wrote: > [M.-A. Lemburg ] >> I think you are mixing up a few things here, or I'm just >> misunderstanding you, which is just as possible (date/time is >> full of wonders, so no surprise there :-)) >> >> If you have a local date/time value which references a day and a time >> on that day, you can convert this into UTC without caring about >> leap seconds at all. The reason is that the conversion is done >> to, again, a date and a time on that day. > > I think almost everyone here does understand all this, although I > don't believe your last point was explicitly pointed out before. It's > worth pointing out! A time entered as YYMMDD HHMMSS can be viewed as > specifying an exact real-life UTC time, and so can (within the > limitations of the system clock) asking "what time is it now?". For > most people, that's enough. > > Leap seconds do cause the TAI and UTC Gregorian calendars to drift > more & more out of synch over time, but, as you say, those so inclined > can view each new day as starting exactly in synch with real-life UTC > again. What they can't do with the builtin stuff is get exact > duration in SI seconds between two datetimes viewed as representing > real-life UTC. 
> > Which Chris wants to do. Endlessly ;-) Looking at the Earth's irregular rotation and the many factors influencing it, I wouldn't be too sure about an everlasting increase in UTC-TAI difference, but you're probably right about this not changing in our lifetime :-) These are the fine folks modelling all this: http://hpiers.obspm.fr/eop-pc/index.php (note how the length of a celestial day changes even within each year) and here's a list of factors that influence the Earth's rotation: http://hpiers.obspm.fr/eop-pc/index.php?index=excitation&lang=en and their direct influence on the UTC-TAI difference: http://hpiers.obspm.fr/eop-pc/index.php?index=leapsecond&lang=en (note how we've managed to slow down the increase in difference around the year 2000). Oh, and BTW, these are the result of their latest poll regarding a redefinition of UTC without leap seconds: http://hpiers.obspm.fr/eop-pc/questionnaire/reponse_questionnaire.html Doesn't look like it's going to change anytime soon :-) I guess we'll just end up with more advanced notice of new leap second additions. OTOH, if we do more research into what happened around 2000, we might even get UTC back in line with TAI. Perhaps we just need more Internet bubbles bursting ;-) >> ... >> Regarding terms: I started with "GMT" in mxDateTime, then >> added "UTC" as alias, and may well add "TAI" at some point as >> another alias. For most people, these are all the same, anyway :-) > > Don't add TAI as an alias. Only a relative handful of people have > ever heard of it. Among those who both have and care about it, it has > a precisely defined meaning which isn't satisfied at all unless they > _can_ get elapsed SI seconds by subtracting. That's a primary use > case for TAI. The primary use case for UTC is to approximate UT1 > ("solar time") in a way that doesn't require changing what "a second" > means. as a duration, at every instant. 
> >> Purists will loudly disagree, of course, but I have the Zen >> of Python to my rescue: practicality beats purity. > > So long as you add "Refuse the temptation to add TAI as an alias for UTC" ;-) Hmm, you're probably right... for now. Will check back for mxDateTime's 25th anniversary :-) -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 06 2015) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> mxODBC Plone/Zope Database Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From alexander.belopolsky at gmail.com Thu Aug 6 02:06:17 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 5 Aug 2015 20:06:17 -0400 Subject: [Datetime-SIG] Fwd: Calendar vs timespan calculations... In-Reply-To: <55C299CE.9060301@egenix.com> References: <55BE66DA.7010206@stoneleaf.us> <7803688847893344661@unknownmsgid> <2497717646148121916@unknownmsgid> <55C2391A.5000405@egenix.com> <55C299CE.9060301@egenix.com> Message-ID: On Wed, Aug 5, 2015 at 7:18 PM, M.-A. Lemburg wrote: > and their direct influence on the UTC-TAI difference: > > http://hpiers.obspm.fr/eop-pc/index.php?index=leapsecond&lang=en Cool plot, but "due to the initial choice of the value of the second (1/86400 mean solar day of the year 1820)" sounds like nonsense. How did they measure the "mean solar day of the year 1820" in caesium-133 radiation periods without sending someone back in time with an atomic clock? I thought only Guido had a time machine! 
From mal at egenix.com Thu Aug 6 11:46:01 2015 From: mal at egenix.com (M.-A. Lemburg) Date: Thu, 06 Aug 2015 11:46:01 +0200 Subject: [Datetime-SIG] Fwd: Calendar vs timespan calculations... In-Reply-To: References: <55BE66DA.7010206@stoneleaf.us> <7803688847893344661@unknownmsgid> <2497717646148121916@unknownmsgid> <55C2391A.5000405@egenix.com> <55C299CE.9060301@egenix.com> Message-ID: <55C32CD9.6010104@egenix.com> On 06.08.2015 02:06, Alexander Belopolsky wrote: > On Wed, Aug 5, 2015 at 7:18 PM, M.-A. Lemburg wrote: >> and their direct influence on the UTC-TAI difference: >> >> http://hpiers.obspm.fr/eop-pc/index.php?index=leapsecond&lang=en > > Cool plot, but "due to the initial choice of the value of the second > (1/86400 mean solar day of the year 1820)" sounds like nonsense. How > did they measure the "mean solar day of the year 1820" in caesium-133 > radiation periods without sending someone back in time with an atomic > clock? I thought only Guido had a time machine! I guess in this case, it's more a coincidence than Guido lending someone his time machine :-) In the 18th and 19th century, a second was defined as 1/86400 of an average mean solar day (fraction of a tropical year). At the time, people apparently believed this to be mostly constant. Early in the 1900s, it was found that Earth's rotation is not constant enough to base a standard on it. So the definition was adapted to mean a certain fraction of a specific tropical year (rather than an average over many years), in this case 1900: the ephemeris second. However, not to the effect of making one ephemeris second a 1/86400 fraction of a mean solar day in 1900. In the late 1960s, the definition was again refined to be based on the atomic cesium clocks: the SI second was born. 
Comparing this definition to the definition used in the 18th and 19th century, it was then determined that one SI second corresponds to one second (using the old definition of the mean solar day, but now for specific years, rather than averages) in the year 1820. This is how we ended up with one SI second = 1/86400 mean solar day of the year 1820. Our time keepers are doing a pretty good job there, I must say :-) More details on all this are available at: http://tycho.usno.navy.mil/leapsec.html (provided their server is up) This also has a nice chart showing how the length of a day varies with time. Hmm, I wonder what traders would make of such a chart - I guess it's time to buy some LOD stocks now :-) Now, if we could only get Earth to behave and speed up its rotation again... -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 06 2015) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> mxODBC Plone/Zope Database Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ ::::: Try our mxODBC.Connect Python Database Interface for free ! :::::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From chris.barker at noaa.gov Thu Aug 6 17:11:04 2015 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 6 Aug 2015 08:11:04 -0700 Subject: [Datetime-SIG] Fwd: Calendar vs timespan calculations... 
In-Reply-To: <55C299CE.9060301@egenix.com>
References: <55BE66DA.7010206@stoneleaf.us> <7803688847893344661@unknownmsgid> <2497717646148121916@unknownmsgid> <55C2391A.5000405@egenix.com> <55C299CE.9060301@egenix.com>
Message-ID: <-7422578592706651825@unknownmsgid>

>> What they can't do with the builtin stuff is get exact
>> duration in SI seconds between two datetimes viewed as representing
>> real-life UTC.
>>
>> Which Chris wants to do. Endlessly ;-)

Yes, but only back to 1820 ;-)

>> Don't add TAI as an alias. Only a relative handful of people have
>> ever heard of it. Among those who both have and care about it, it has
>> a precisely defined meaning which isn't satisfied at all unless they
>> _can_ get elapsed SI seconds by subtracting. That's a primary use
>> case for TAI.

Doesn't an implementation of the Proleptic Gregorian Calendar with no
leap seconds provide that? e.g. the Python datetime implementation?

M-A's note made this all a bit more clear to me: business use cases
are a lot more concerned with the Calendar than actual elapsed time.

On the other hand, I do scientific applications, and am far more
concerned with accurate elapsed time.

And yes, most code uses an internal time representation in integer
seconds, or the like.

But we still have to deal with input and output in human-friendly
time. And for that, python's datetime is very, very useful. And
primarily because it does a good job with 'strict' timedeltas.

In fact, given that integer units of time are the most natural, the
primary use of python's datetime that I've seen is to convert to-from
a "timespan since an epoch" representation, which requires proper
timespan computation.

(And a lot of the data I've seen is based on poor choices of epoch,
unfortunately)

So my entire point here is that it's great to preserve and enhance
that capability, and leave the door open to future enhancements. And
Alexander's work looks like it's going in the right direction.
Thanks for indulging me -- I, at least, learned a lot.

-Chris

From alexander.belopolsky at gmail.com Thu Aug 6 20:21:41 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 6 Aug 2015 14:21:41 -0400
Subject: [Datetime-SIG] Fwd: Calendar vs timespan calculations...
In-Reply-To: <55C32CD9.6010104@egenix.com>
References: <55BE66DA.7010206@stoneleaf.us> <7803688847893344661@unknownmsgid> <2497717646148121916@unknownmsgid> <55C2391A.5000405@egenix.com> <55C299CE.9060301@egenix.com> <55C32CD9.6010104@egenix.com>
Message-ID:

On Thu, Aug 6, 2015 at 5:46 AM, M.-A. Lemburg wrote:

[Alexander Belopolsky]
>> ... "due to the initial choice of the value of the second
>> (1/86400 mean solar day of the year 1820)" sounds like nonsense.

[MAL]
> More details on all this are available at:
> http://tycho.usno.navy.mil/leapsec.html (provided their server is up)

Thanks for the link! "Modern studies have indicated that the epoch at
which the mean solar day was exactly 86,400 SI seconds was
approximately 1820." makes much more sense: they did not pick the
value for SI second to match 1/86400 mean solar day of the year 1820,
it just so happened that the value they picked for other reasons
matched later estimates of what we now think 1/86400 mean solar day of
the year 1820 was.

I now believe it was all bad PR. What they should have done was to
announce that according to the new measurements a "year" is 1 second
longer than people thought, so they would add 1 second at the end of
June *every year*, but since the Earth is wobbling unpredictably, on
some years they will occasionally skip a second in December. If they
did that, 30 years later leap second support would be as widespread as
support for February 29.

From tim.peters at gmail.com Thu Aug 6 20:29:49 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 6 Aug 2015 13:29:49 -0500
Subject: [Datetime-SIG] Fwd: Calendar vs timespan calculations...
In-Reply-To: <-7422578592706651825@unknownmsgid>
References: <55BE66DA.7010206@stoneleaf.us> <7803688847893344661@unknownmsgid> <2497717646148121916@unknownmsgid> <55C2391A.5000405@egenix.com> <55C299CE.9060301@egenix.com> <-7422578592706651825@unknownmsgid>
Message-ID:

[Tim]
>>> Don't add TAI as an alias. Only a relative handful of people have
>>> ever heard of it. Among those who both have and care about it, it has
>>> a precisely defined meaning which isn't satisfied at all unless they
>>> _can_ get elapsed SI seconds by subtracting. That's a primary use
>>> case for TAI.

[Chris Barker]
> Doesn't an implementation of the Proleptic Gregorian Calendar with no
> leap seconds provide that? e.g. the Python datetime implementation?

TAI has more than one "moving piece". Because _all_ the pieces are
important to the relative handful of people who need TAI, it's highly
misleading to call something "TAI" unless it implements all the
pieces.

Provided you view naive datetimes as expressing times in TAI calendar
notation, then, yes, Python's calendar notation and classic arithmetic
implement those parts of TAI precisely.

But there's no builtin support for any other piece needed to use TAI:

- There's no way to ask "what time is it now?" and get a TAI result
  (unless you're on one of the rare rumored Linux variants configured to
  use TAI for the system clock), neither in TAI seconds-since-the-epoch
  notation nor in TAI calendar notation.

- There's no direct way to ask how a datetime expressed in TAI
  calendar notation is expressed as TAI seconds-since-the-epoch
  notation. This is easy to write a function for, though (subtract a
  naive datetime expressing the start of the TAI epoch in TAI calendar
  notation). Ditto in the other direction.

- There's no way to use the current timezone implementation to convert
  TAI to or from any other time scheme.

> M-A's note made this all a bit more clear to me: business use cases
> are a lot more concerned with the Calendar than actual elapsed time.
Even worse, businesses generally want _not_ to see actual elapsed
time. For example, you rent a car for a week, at $300/week plus $50
for every hour over. You pick it up at the airport at 11am Saturday,
hand over $300, and return it by 11am the following Saturday. They
demand you pay an extra $50, because DST ended the day after you
picked it up. Heh - that's a great way to ensure you never rent from
them again ;-)

> On the other hand, I do scientific applications, and am far more
> concerned with accurate elapsed time.
>
> And yes, most code uses an internal time representation in integer
> seconds, or the like.
>
> But we still have to deal with input and output in human-friendly
> time. And for that, python's datetime is very, very useful. And
> primarily because it does a good job with 'strict' timedeltas.

In the language we seem to have settled on, timedelta does "classic
arithmetic" now, and how it _would_ work if it _did_ account for
things like DST transitions is called "strict" or "timeline"
arithmetic.

> In fact, given that integer units of time are the most natural, the
> primary use of python's datetime that I've seen is to convert to-from
> a "timespan since an epoch" representation, which requires proper
> timespan computation.

I don't see how that could be a primary use now, since Python's
datetime arithmetic now doesn't do "proper timespan computation" (as I
believe you mean it) in any time zone defined as an offset (including
0) from real-life UTC.

You can't do it using POSIX seconds-from-the-epoch timestamps either
(not directly; you would need additional code to account for leap
second adjustments across the span of interest).

> (And a lot of the data I've seen is based on poor choices of epoch,
> unfortunately)

Social problem: nag them ;-)

> So my entire point here is that it's great to preserve and enhance
> that capability, and leave the door open to future enhancements. And
> Alexander's work looks like it's going in the right direction.
>
> Thanks for indulging me -- I, at least, learned a lot.

I just hope some of it finds some actual use ;-) Have you read this
yet? It explains a lot.

    https://en.wikipedia.org/wiki/Unix_time

From alexander.belopolsky at gmail.com Thu Aug 6 20:38:14 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 6 Aug 2015 14:38:14 -0400
Subject: [Datetime-SIG] Fwd: Calendar vs timespan calculations...
In-Reply-To: <-7422578592706651825@unknownmsgid>
References: <55BE66DA.7010206@stoneleaf.us> <7803688847893344661@unknownmsgid> <2497717646148121916@unknownmsgid> <55C2391A.5000405@egenix.com> <55C299CE.9060301@egenix.com> <-7422578592706651825@unknownmsgid>
Message-ID:

On Thu, Aug 6, 2015 at 11:11 AM, Chris Barker - NOAA Federal wrote:
> M-A's note made this all a bit more clear to me: business use cases
> are a lot more concerned with the Calendar than actual elapsed time.
> On the other hand, I do scientific applications, and am far more
> concerned with accurate elapsed time.

Another point that some people often don't understand is that when
they run something like

    $ TZ=UTC date
    Thu Aug 6 18:28:49 UTC 2015

on their POSIX compliant computers that run ntpd to sync with the
atomic clock, they get a bona fide UTC time with all leap seconds
accounted for. It is only when they do

    $ TZ=UTC date +%s
    1438885729

the number they get is *not* the number of seconds elapsed since UTC
midnight of 1970-01-01. This number is some 26 seconds off, but how
many people care about this number? On the other hand the value that
everyone looks at (Thu Aug 6 18:28:49 UTC 2015) is correct on POSIX
systems except for that one second that by strict UTC rules should be
displayed as 23:59:60. (And, depending on how paranoid your system
admin is, you possibly see delayed time for a few hours before and
after the leap second insertion.)
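The same point can be checked from Python. This is a minimal sketch, not part of the original message: it decodes the timestamp quoted above into UTC calendar fields and shows that the mapping round-trips exactly, because POSIX time counts every day as 86400 seconds. The constant `LEAP_SECONDS_INSERTED` is an assumption taken from the "some 26 seconds" figure in the message, so the "elapsed" value is only approximate.

```python
import calendar
from datetime import datetime, timezone

stamp = 1438885729  # the `TZ=UTC date +%s` output quoted in the message

# POSIX time -> UTC calendar fields; each day is treated as 86400 seconds.
as_utc = datetime.fromtimestamp(stamp, tz=timezone.utc)
print(as_utc.isoformat())

# Calendar fields -> POSIX time is the exact inverse: no leap seconds appear.
round_trip = calendar.timegm(as_utc.utctimetuple())

# Assumption: 26 leap seconds were inserted between 1972 and mid-2015, so the
# true elapsed time is larger than the timestamp by roughly this amount.
LEAP_SECONDS_INSERTED = 26
approx_elapsed = stamp + LEAP_SECONDS_INSERTED
print(round_trip == stamp, approx_elapsed - stamp)
```

The displayed calendar time is right; only the count of seconds since the epoch leaves the leap seconds out.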
From chris.barker at noaa.gov Thu Aug 6 21:11:24 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Thu, 6 Aug 2015 12:11:24 -0700
Subject: [Datetime-SIG] Fwd: Calendar vs timespan calculations...
In-Reply-To:
References: <55BE66DA.7010206@stoneleaf.us> <7803688847893344661@unknownmsgid> <2497717646148121916@unknownmsgid> <55C2391A.5000405@egenix.com> <55C299CE.9060301@egenix.com> <-7422578592706651825@unknownmsgid>
Message-ID:

On Thu, Aug 6, 2015 at 11:29 AM, Tim Peters wrote:

> TAI has more than one "moving piece". Because _all_ the pieces are
> important to the relative handful of people who need TAI, it's highly
> misleading to call something "TAI" unless it implements all the
> pieces.
>
> Provided you view naive datetimes as expressing times in TAI calendar
> notation, then, yes, Python's calendar notation and classic arithmetic
> implement those parts of TAI precisely.
>
> But there's no builtin support for any other piece needed to use TAI:
>

Fair enough.

> - There's no way to ask "what time is it now?" and get a TAI result
> (unless you're on one of the rare rumored Linux variants configured to
> use TAI for the system clock), neither in TAI seconds-since-the-epoch
> notation nor in TAI calendar notation.
>

uhm, yeah, we'd need to handle leap-seconds for that :-)

> - There's no direct way to ask how a datetime expressed in TAI
> calendar notation is expressed as TAI seconds-since-the-epoch
> notation. This is easy to write a function for, though (subtract a
> naive datetime expressing the start of the TAI epoch in TAI calendar
> notation). Ditto in the other direction.
>

kind of my point -- you can't do that with python datetime and "real"
UTC...

> - There's no way to use the current timezone implementation to convert
> TAI to or from any other time scheme.
>

Ah, yes -- again my point -- I don't think we want to preempt this
being possible in the future (probably with a third-party lib)

> > M-A's note made this all a bit more clear to me: business use cases
> > are a lot more concerned with the Calendar than actual elapsed time.
>
> Even worse, businesses generally want _not_ to see actual elapsed
> time.

right -- even stronger point.

> > time. And for that, python's datetime is very, very useful. And
> > primarily because it does a good job with 'strict' timedeltas.
>
> In the language we seem to have settled on, timedelta does "classic
> arithmetic" now, and how it _would_ work if it _did_ account for
> things like DST transitions is called "strict" or "timeline"
> arithmetic.
>

right -- forgot to sprinkle "naive" liberally around this post....

> I don't see how that could be a primary use now, since Python's
> datetime arithmetic now doesn't do "proper timespan computation" (as I
> believe you mean it) in any time zone defined as an offset (including
> 0) from real-life UTC.
>

indeed it doesn't -- but that is indeed a primary use now -- but only
with naive datetimes -- we have to handle timezones independently.
Fortunately, at least in the data I work with, the datetime is
expressed in ISO 8601 text form, which lets you specify an offset, but
not a politically driven, changes-who-knows-when time zone. So "switch
to UTC, use naive datetimes" works fine. But I'm interested in all
this because it would be very nice to be able to properly handle full
time zone support.

> You can't do it using POSIX seconds-from-the-epoch timestamps either
> (not directly; you would need additional code to account for leap
> second adjustments across the span of interest).

Uhhm, but it seems I've been told over and over again in this thread
that I shouldn't want leap seconds, no one cares about them, and
python will never support them?!??!?
So can we just leave it at:

"There are no plans for python to support leap seconds, and it is also
not a motivating factor in current work to make it possible in the
future."

When and if (big if) anyone wants to try to support leap seconds in
the future, we can discuss whether it's possible or a good idea then.

I like the idea, but I like a lot of ideas, and for now, I'm happy
enough to say that anyone passing me data in a dumb-ass epoch may get
their results 26 seconds or so off.

> > (And a lot of the data I've seen is based on poor choices of epoch,
> > unfortunately)
>
> Social problem: nag them ;-)

sure -- but we need to deal with what we need to deal with -- heck if
we could drive use cases with software developers' needs, we sure as
heck wouldn't have DST!

As long as it's a 26 or so seconds problem, I'm fine. On the other
hand, we just got a data set encoded to a "I have no idea why" epoch
before 1970, which breaks all our old C code that uses unsigned
integers since 1970. So we have to translate that to a better epoch
first (using python's datetime, natch).

> I just hope some of it finds some actual use ;-)

at least some will...

> Have you read this
> yet? It explains a lot.
>
> https://en.wikipedia.org/wiki/Unix_time

thanks for the link -- I _think_ I knew all that, but good to have it
one place.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959 voice
7600 Sand Point Way NE   (206) 526-6329 fax
Seattle, WA 98115        (206) 526-6317 main reception

Chris.Barker at noaa.gov

From tim.peters at gmail.com Fri Aug 7 00:11:23 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Thu, 6 Aug 2015 17:11:23 -0500
Subject: [Datetime-SIG] Fwd: Calendar vs timespan calculations...
In-Reply-To:
References: <55BE66DA.7010206@stoneleaf.us> <7803688847893344661@unknownmsgid> <2497717646148121916@unknownmsgid> <55C2391A.5000405@egenix.com> <55C299CE.9060301@egenix.com> <-7422578592706651825@unknownmsgid>
Message-ID:

[Chris Barker]
> Uhhm, but it seems I've been told over and over again in this thread that I
> shouldn't want leap seconds, no one cares about them, and python will never
> support them?!??!?

Nobody denies that you need accurate deltas in SI seconds, and nobody
suggests that you "shouldn't" need them. But, kinda yes, almost
everyone will tell you that only a relative handful (not "no one") of
programmers and apps care about exact SI durations.

Over half the advice you've been given assumes you would like to solve
the problem you have sometime this decade ;-) If you do, then the
simplest path to solving it _soon_ is to use TAI internally. Get the
very _notion_ of leap seconds (and DST transitions, and
base-offset-from-UTC transitions) out of the representation you use
for computation, and then the arithmetic needed to work with exact SI
durations is simple, reliable, fast and surprise-free. Not just in
Python, but in any other language you care to use, and you don't need
any changes to any language or library to pursue this path.

You would need to write to/from TAI conversion functions yourself
(which, in Python, can build on .astimezone() to do the bulk of the
work), and maintain a leap seconds table yourself. While not trivial,
neither is that a large project.

Or you could try using the astropy package, which I pointed at before,
which already implements TAI. I believe it's included in all versions
of Enthought's Scientific Python distribution, so I wouldn't be
surprised if you - as a scientist - already have it. However, I never
used astropy, and don't, e.g., know whether it supports conversions
between astronomical and civil time schemes.
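As a rough illustration of the "use TAI internally" route, here is a minimal sketch of a hand-maintained leap-second table and a UTC-to-TAI conversion. The names `TAI_MINUS_UTC`, `utc_to_tai` and `si_seconds_between` are hypothetical, the table is only a three-entry excerpt (a real one needs every entry since 1972 and must be kept up to date), and the inserted second 23:59:60 itself cannot be represented by `datetime`.

```python
from bisect import bisect_right
from datetime import datetime, timedelta

# Excerpt only (assumption: a full table would list every change since 1972).
# Each entry: (naive UTC instant from which the offset applies, TAI - UTC in s).
TAI_MINUS_UTC = [
    (datetime(2009, 1, 1), 34),
    (datetime(2012, 7, 1), 35),
    (datetime(2015, 7, 1), 36),
]

def utc_to_tai(utc_naive):
    """Convert a naive datetime holding UTC fields to TAI calendar notation."""
    starts = [start for start, _ in TAI_MINUS_UTC]
    i = bisect_right(starts, utc_naive) - 1
    if i < 0:
        raise ValueError("date precedes this excerpt of the table")
    return utc_naive + timedelta(seconds=TAI_MINUS_UTC[i][1])

def si_seconds_between(utc_a, utc_b):
    """Elapsed SI seconds from utc_a to utc_b: classic subtraction in TAI."""
    return (utc_to_tai(utc_b) - utc_to_tai(utc_a)).total_seconds()

before = datetime(2015, 6, 30, 23, 59)
after = datetime(2015, 7, 1, 0, 1)
print((after - before).total_seconds())   # classic arithmetic on UTC fields
print(si_seconds_between(before, after))  # includes the 2015-06-30 leap second
```

Once both values are in TAI notation, ordinary naive `datetime` subtraction gives the SI duration, which is the property described above.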
If you want to talk about the indefinite slobbering ;-) future
instead:

> So can we just leave it at:
>
> "There are no plans for python to support leap seconds, and it is also not a
> motivating factor in current work to make it possible in the future."

I can't speak for everyone on this. Guido already said he's not
interested, but Alexander is clearly paying attention to it anyway.
For me, it's just another kind of annoying time adjustment, no
different in principle from annoying DST adjustments, so I expect we
would have to go out of our way to _preclude_ someone implementing
leap second support some day.

However, I'm skeptical that anyone wants leap second support enough to
do all the work required to implement timeline arithmetic taking
account of them.

> ...
> As long as it's a 26 or so seconds problem, I'm fine. On the other hand, we
> just got a data set encode to a "I have no idea why" epoch before 1970,
> which breaks all our old C code that uses unsigned integers since 1970. So
> we have to translate that to a better epoch first (using python's datetime,
> natch).

Users. The best part of retirement is that I'm my only user now - and
I'm _still_ endlessly annoying to me ;-)

From chris.barker at noaa.gov Fri Aug 7 03:15:54 2015
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Thu, 6 Aug 2015 18:15:54 -0700
Subject: [Datetime-SIG] Fwd: Calendar vs timespan calculations...
In-Reply-To:
References: <55BE66DA.7010206@stoneleaf.us> <7803688847893344661@unknownmsgid> <2497717646148121916@unknownmsgid> <55C2391A.5000405@egenix.com> <55C299CE.9060301@egenix.com> <-7422578592706651825@unknownmsgid>
Message-ID: <274212063335535507@unknownmsgid>

A little levity:

http://xkcd.com/376/

From alexander.belopolsky at gmail.com Sat Aug 8 05:38:51 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Fri, 7 Aug 2015 23:38:51 -0400
Subject: [Datetime-SIG] PEP 495: Local time disambiguation
Message-ID:

I am close enough to finalizing PEP 495 and its reference
implementation, that it is time to have some bikeshedding fun and pick
the name for the new flag.

On Thu, Jul 30, 2015 at 1:46 PM, Alexander Walters wrote:
> '.first' also makes the default 'True', and it might just be me, but I don't
> like that aesthetically.

This is a valid concern and even a short experience with
datetime.first shows that in most cases the needed value is (not
first).

In the early versions of my proposal, I was using a "which" variable
with values 0 or 1 as some sort of index into the ordered list of UTC
times that correspond to the ambiguous local time, but I soon realized
that which=0 or which=1 are devoid of any mnemonic meaning. I also
wanted to have a true boolean flag with a meaningful name.

The best I could come up with was time(1, 30, first=True) to stand for
the "first 01:30" and time(1, 30, first=False) to stand for the
"second (not first) 01:30". I did want to have a flag with the
opposite meaning, but for obvious reasons, we cannot have a flag
called "second". I also rejected "later" because I can never remember
when to use "later" and when "latter" and since either of those words
could be used as the name of the disambiguation flag, using either of
them would be a disservice to our users.

After I published the first draft of the PEP, I realized that the
second of the two ambiguous hours can also be called "repeated."
As I am writing this, I really have no objections to
time(1, 30, repeated=True) spelling for the "second 01:30" other than
having to make several dozen changes to the PEP and to the
implementation code.

If I have to make a name change, I would really like to make it only
once. So if you hate the first=True default - speak now or forever
hold your peace.

From alexander.belopolsky at gmail.com Sun Aug 9 00:15:01 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sat, 8 Aug 2015 18:15:01 -0400
Subject: [Datetime-SIG] PEP -0500: A protocol for delegating datetime methods
Message-ID:

PEP: 500
Title: A protocol for delegating datetime methods to their
       tzinfo implementations
Version: $Revision$
Last-Modified: $Date$
Author: Alexander Belopolsky
Discussions-To: Datetime-SIG
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Requires: 495
Created: 08-Aug-2015


Abstract
========

This PEP specifies a new protocol (PDDM - "A Protocol for Delegating
Datetime Methods") that can be used by concrete implementations of the
``datetime.tzinfo`` interface to override aware datetime arithmetics,
formatting and parsing. We describe changes to the
``datetime.datetime`` class to support the new protocol and propose a
new abstract class ``datetime.tzstrict`` that implements parts of this
protocol necessary to make aware datetime instances follow "strict"
arithmetic rules.


Rationale
=========

As of Python 3.5, aware datetime instances that share a ``tzinfo``
object follow the rules of arithmetics that are induced by a simple
bijection between (year, month, day, hour, minute, second,
microsecond) 7-tuples and large integers.
In this arithmetics, the difference between YEAR-11-02T12:00 and
YEAR-11-01T12:00 is always 24 hours, even though in the US/Eastern
timezone, for example, there are 25 hours between 2014-11-01T12:00 and
2014-11-02T12:00 because the local clocks were rolled back one hour at
2014-11-02T02:00, introducing an extra hour in the night between
2014-11-01 and 2014-11-02.

Many business applications require the use of Python's simplified view
of local dates. No self-respecting car rental company will charge its
customers more for a week that straddles the end of DST than for any
other week or require that they return the car an hour early.
Therefore, changing the current rules for aware datetime arithmetics
will not only create a backward compatibility nightmare, it will
eliminate support for legitimate and common use cases.

Since it is impossible to choose universal rules for local time
arithmetics, we propose to delegate implementation of those rules to
the classes that implement ``datetime.tzinfo`` interface. With such
delegation in place, users will be able to choose between different
arithmetics by simply picking instances of different classes for the
value of ``tzinfo``.


Protocol
========

Subtraction of datetime
-----------------------

A ``tzinfo`` subclass supporting the PDDM may define a method called
``__datetime_diff__`` that should take two ``datetime.datetime``
instances and return a ``datetime.timedelta`` instance representing
the time elapsed from the time represented by the first datetime
instance to another.


Addition
--------

A ``tzinfo`` subclass supporting the PDDM may define a method called
``__datetime_add__`` that should take two arguments--a datetime and a
timedelta instance--and return a datetime instance.


Subtraction of timedelta
------------------------

A ``tzinfo`` subclass supporting the PDDM may define a method called
``__datetime_sub__`` that should take two arguments--a datetime and a
timedelta instance--and return a datetime instance.
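As a concrete check of the 24-versus-25-hour claim in the Rationale, here is a minimal sketch that is not part of the PEP text. It avoids any timezone database by using two fixed-offset zones, `EDT` and `EST`, which are assumptions standing in for US/Eastern on either side of the 2014-11-02 transition.

```python
from datetime import datetime, timedelta, timezone

# Assumed stand-ins for US/Eastern before and after the 2014-11-02 fall-back.
EDT = timezone(timedelta(hours=-4), "EDT")
EST = timezone(timedelta(hours=-5), "EST")

start_wall = datetime(2014, 11, 1, 12, 0)
end_wall = datetime(2014, 11, 2, 12, 0)

# Classic arithmetic: plain subtraction of the calendar fields.
classic = end_wall - start_wall

# Elapsed time: attaching different offsets makes Python compare the two
# instants in UTC, so the extra hour of the repeated 01:00-02:00 shows up.
elapsed = end_wall.replace(tzinfo=EST) - start_wall.replace(tzinfo=EDT)

print(classic, elapsed)
```

Subtracting two aware datetimes that share one ``tzinfo`` object gives the classic result; the elapsed result appears here only because the two operands carry different ``tzinfo`` objects.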


Formatting
----------

A ``tzinfo`` subclass supporting the PDDM may define methods called
``__datetime_isoformat__`` and ``__datetime_strftime__``.

The ``__datetime_isoformat__`` method should take a datetime instance
and an optional separator and produce a string representation of the
given datetime instance.

The ``__datetime_strftime__`` method should take a datetime instance
and a format string and produce a string representation of the given
datetime instance formatted according to the given format.


Parsing
-------

A ``tzinfo`` subclass supporting the PDDM may define a class method
called ``__datetime_strptime__`` and register the "canonical" names of
the timezones that it implements with a registry. **TODO** Describe a
registry.


Changes to datetime methods
===========================

Subtraction
-----------

::

    class datetime:
        def __sub__(self, other):
            if isinstance(other, datetime):
                try:
                    self_diff = self.tzinfo.__datetime_diff__
                except AttributeError:
                    self_diff = None
                try:
                    # Look up the hook on the *other* operand's tzinfo.
                    other_diff = other.tzinfo.__datetime_diff__
                except AttributeError:
                    other_diff = None
                if self_diff is not None:
                    if (self_diff is not other_diff and
                            self_diff.__func__ is not other_diff.__func__):
                        raise ValueError("Cannot find difference of two datetimes with "
                                         "different tzinfo.__datetime_diff__ implementations.")
                    return self_diff(self, other)
            elif isinstance(other, timedelta):
                try:
                    sub = self.tzinfo.__datetime_sub__
                except AttributeError:
                    pass
                else:
                    return sub(self, other)
                return self + -other
            else:
                return NotImplemented
            # current implementation


Addition
--------

Addition of a timedelta to a datetime instance will be delegated to the
``self.tzinfo.__datetime_add__`` method whenever it is defined.


Strict arithmetics
==================

A new abstract subclass of ``datetime.tzinfo`` class called
``datetime.tzstrict`` will be added to the ``datetime`` module.
This subclass will not implement the ``utcoffset()``, ``tzname()`` or
``dst()`` methods, but will implement some of the methods of the PDDM.

The PDDM methods implemented by ``tzstrict`` will be equivalent to the
following::

    class tzstrict(tzinfo):
        def __datetime_diff__(self, dt1, dt2):
            utc_dt1 = dt1.astimezone(timezone.utc)
            utc_dt2 = dt2.astimezone(timezone.utc)
            return utc_dt2 - utc_dt1

        def __datetime_add__(self, dt, delta):
            utc_dt = dt.astimezone(timezone.utc)
            return (utc_dt + delta).astimezone(self)

        def __datetime_sub__(self, dt, delta):
            utc_dt = dt.astimezone(timezone.utc)
            return (utc_dt - delta).astimezone(self)


Parsing and formatting
----------------------

Datetime methods ``strftime`` and ``isoformat`` will delegate to the
namesake methods of their ``tzinfo`` members whenever those methods
are defined.

When the ``datetime.strptime`` method is given a format string that
contains a ``%Z`` instruction, it will look up the ``tzinfo``
implementation in the registry by the given timezone name and call its
``__datetime_strptime__`` method.


Applications
============

This PEP will enable third party implementation of many different
timekeeping schemes including:

* Julian / Microsoft Excel calendar.
* "Right" timezones with the leap second support.
* French revolutionary calendar (with a lot of work).


Copyright
=========

This document has been placed in the public domain.

From chris.barker at noaa.gov Tue Aug 11 03:48:47 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Mon, 10 Aug 2015 18:48:47 -0700
Subject: [Datetime-SIG] PEP -0500: A protocol for delegating datetime methods
In-Reply-To:
References:
Message-ID:

Alexander,

Thanks for writing this up. It does look like this is a way forward to
more complete datetime use cases.

However, IIUC, this approach essentially delegates all logic to the
tzinfo objects, leaving datetime as nothing more than a container for,
well, the datetime -- i.e., not really much more than a namedtuple
holding year, month, day, hour, min, sec, usec.

This may well be the best way to move forward while maintaining
backward compatibility, but it sure seems like a confusing mixture of
functionality, and, while it can support both different calendars and
different calendar math, it seems like it will get ugly to mix and
match (though maybe it can be done with multiple inheritance).

However, from an object oriented design perspective, I see:

1) A timezone is a particular definition -- it is essentially a set of
(fairly arbitrary, political, and changing) rules for defining an
offset to UTC for various times and places. That's it, and that's what
the current tzinfo captures.

2) timedelta is a duration in time -- essentially microseconds encoded
in days + microseconds. A very simple class.

3) datetime is a representation of the Proleptic Gregorian calendar.
But as has been pointed out, it is essentially an encoding of
microseconds since year 1 of that calendar (without leap seconds). It
contains the logic for converting from the year, month, day encoding
to/from microseconds. It also holds the logic for doing datetime
arithmetic. As I understand it, to do addition and subtraction, it
converts to microseconds from year 1, does the math, then converts
back.

So:

If you want a new timezone, then a new tzinfo object makes sense.

But if you want a different calendar (French Revolutionary, or maybe
Gregorian with leap seconds...) then what you want is a new datetime
object (maybe a datetime subclass), not a new tzinfo object.

If you want math to be handled differently, then you also want a new
datetime object.

And, in fact, the current datetime object does "strict" math (duration
arithmetic) just fine -- what it doesn't do is handle the time zones
in a particularly useful way when doing math. (i.e., it then does
Period arithmetic, but only in a very limited way.)

So is there any reason to delegate everything to the tzinfo object???
If one were to make a new datetime object to support leap seconds, or
some other calendar, wouldn't you be able to use it with the tzinfo
objects provided by pytz, etc?

So what do we want to be able to support?

1) "strict" or duration arithmetic on time zone aware datetimes:

 - so what this really needs is a change in the datetime logic; it
   should:
    - convert to UTC
    - do the math
    - convert back to the desired timezone.

But we don't want to change current behavior, so we could either:

 - have an optional flag (strict=True) on the tzinfo object that
   datetime would check, so it knows what kind of arithmetic you want.
   -- I don't think the flag really belongs there, but since this is
   only an issue where tzinfo objects are involved it makes some sense
   to put it there.

2) Full featured Period arithmetic: the current behavior -- you can
get period arithmetic, but only with units of days, and only if the
timezones are the same (I think...) -- is, shall we say, not as useful
as it could be. I don't need it personally, but I understand that
dateutil provides a lot of it. So we need to just make sure we don't
break dateutil with any of this -- and I can't see how adding a
"strict" flag to a tzinfo object would cause a problem.

3) Leap seconds: OK - maybe no one will ever get around to doing this,
but it seems adding them to a new datetime object would be a fine way
to do that -- and that datetime object would still work with pytz,
dateutil, etc....

4) Other Calendars: Julian, French Revolutionary, climate modeling 360
day, etc... These sure seem to belong in new datetime
objects/subclasses.

So what am I missing???
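The "convert to UTC, do the math, convert back" recipe in item 1 can be sketched without any change to datetime itself. This is an illustration only: `ToyEastern` is a hypothetical tzinfo whose rules are hard-coded for the 2014-11-02 US/Eastern fall-back and are wrong for any other date, `strict_add` is a hypothetical helper, and the `fold` attribute assumes Python 3.6 or later, where the PEP 495 flag was released under that name.

```python
from datetime import datetime, timedelta, timezone, tzinfo

class ToyEastern(tzinfo):
    """Hypothetical US/Eastern rules, valid only near the 2014-11-02 fall-back."""
    EDT = timedelta(hours=-4)
    EST = timedelta(hours=-5)
    SWITCH_UTC = datetime(2014, 11, 2, 6, 0)  # 02:00 EDT becomes 01:00 EST

    def utcoffset(self, dt):
        wall = dt.replace(tzinfo=None)
        if wall < datetime(2014, 11, 2, 1, 0):
            return self.EDT
        if wall >= datetime(2014, 11, 2, 2, 0):
            return self.EST
        # Repeated hour 01:00-02:00: fold picks the first or second pass.
        return self.EST if dt.fold else self.EDT

    def fromutc(self, dt):
        # dt carries UTC field values with tzinfo set to this object.
        utc = dt.replace(tzinfo=None)
        if utc < self.SWITCH_UTC:
            return dt + self.EDT
        local = dt + self.EST
        in_repeated_hour = utc < self.SWITCH_UTC + timedelta(hours=1)
        return local.replace(fold=1) if in_repeated_hour else local

def strict_add(dt, delta):
    """Timeline addition: go to UTC, add, come back to dt's own zone."""
    return (dt.astimezone(timezone.utc) + delta).astimezone(dt.tzinfo)

tz = ToyEastern()
pickup = datetime(2014, 11, 1, 12, 0, tzinfo=tz)
classic = pickup + timedelta(hours=24)         # same wall-clock time next day
strict = strict_add(pickup, timedelta(hours=24))  # 24 elapsed hours later
print(classic.replace(tzinfo=None), strict.replace(tzinfo=None))
```

Classic addition lands on noon the next day, while the strict version lands an hour earlier on the wall clock because the night contained an extra hour. Which of the two a caller wants is the choice the flag or protocol would have to express.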
(BTW, this is all assuming that the datetime/tzinfo protocol is adapted to accommodate the ambiguous times at DST boundaries...)

-Chris

On Sat, Aug 8, 2015 at 3:15 PM, Alexander Belopolsky <alexander.belopolsky at gmail.com> wrote:

> PEP: 500
> Title: A protocol for delegating datetime methods to their
>        tzinfo implementations
> Version: $Revision$
> Last-Modified: $Date$
> Author: Alexander Belopolsky
> Discussions-To: Datetime-SIG
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Requires: 495
> Created: 08-Aug-2015
>
>
> Abstract
> ========
>
> This PEP specifies a new protocol (PDDM - "A Protocol for Delegating
> Datetime Methods") that can be used by concrete implementations of the
> ``datetime.tzinfo`` interface to override aware datetime arithmetics,
> formatting and parsing. We describe changes to the
> ``datetime.datetime`` class to support the new protocol and propose a
> new abstract class ``datetime.tzstrict`` that implements parts of this
> protocol necessary to make aware datetime instances follow "strict"
> arithmetic rules.
>
>
> Rationale
> =========
>
> As of Python 3.5, aware datetime instances that share a ``tzinfo``
> object follow the rules of arithmetics that are induced by a simple
> bijection between (year, month, day, hour, minute, second,
> microsecond) 7-tuples and large integers. In this arithmetics, the
> difference between YEAR-11-02T12:00 and YEAR-11-01T12:00 is always 24
> hours, even though in the US/Eastern timezone, for example, there are
> 25 hours between 2014-11-01T12:00 and 2014-11-02T12:00 because the
> local clocks were rolled back one hour at 2014-11-02T02:00,
> introducing an extra hour in the night between 2014-11-01 and
> 2014-11-02.
>
> Many business applications require the use of Python's simplified view
> of local dates. No self-respecting car rental company will charge its
> customers more for a week that straddles the end of DST than for any
> other week or require that they return the car an hour early.
> Therefore, changing the current rules for aware datetime arithmetics
> will not only create a backward compatibility nightmare, it will
> eliminate support for legitimate and common use cases.
>
> Since it is impossible to choose universal rules for local time
> arithmetics, we propose to delegate implementation of those rules to
> the classes that implement the ``datetime.tzinfo`` interface. With such
> delegation in place, users will be able to choose between different
> arithmetics by simply picking instances of different classes for the
> value of ``tzinfo``.
>
>
> Protocol
> ========
>
> Subtraction of datetime
> -----------------------
>
> A ``tzinfo`` subclass supporting the PDDM may define a method called
> ``__datetime_diff__`` that should take two ``datetime.datetime``
> instances and return a ``datetime.timedelta`` instance representing
> the time elapsed from the time represented by the first datetime
> instance to the other.
>
>
> Addition
> --------
>
> A ``tzinfo`` subclass supporting the PDDM may define a method called
> ``__datetime_add__`` that should take two arguments--a datetime and a
> timedelta instance--and return a datetime instance.
>
>
> Subtraction of timedelta
> ------------------------
>
> A ``tzinfo`` subclass supporting the PDDM may define a method called
> ``__datetime_sub__`` that should take two arguments--a datetime and a
> timedelta instance--and return a datetime instance.
>
>
> Formatting
> ----------
>
> A ``tzinfo`` subclass supporting the PDDM may define methods called
> ``__datetime_isoformat__`` and ``__datetime_strftime__``.
>
> The ``__datetime_isoformat__`` method should take a datetime instance
> and an optional separator and produce a string representation of the
> given datetime instance.
>
> The ``__datetime_strftime__`` method should take a datetime instance
> and a format string and produce a string representation of the given
> datetime instance formatted according to the given format.
>
>
> Parsing
> -------
>
> A ``tzinfo`` subclass supporting the PDDM may define a class method
> called ``__datetime_strptime__`` and register the "canonical" names of
> the timezones that it implements with a registry. **TODO** Describe a
> registry.
>
>
> Changes to datetime methods
> ===========================
>
> Subtraction
> -----------
>
> ::
>
>     class datetime:
>         def __sub__(self, other):
>             if isinstance(other, datetime):
>                 try:
>                     self_diff = self.tzinfo.__datetime_diff__
>                 except AttributeError:
>                     self_diff = None
>                 try:
>                     other_diff = other.tzinfo.__datetime_diff__
>                 except AttributeError:
>                     other_diff = None
>                 if self_diff is not None:
>                     if (self_diff is not other_diff and
>                             self_diff.__func__ is not other_diff.__func__):
>                         raise ValueError(
>                             "Cannot find difference of two datetimes with "
>                             "different tzinfo.__datetime_diff__ implementations.")
>                     return self_diff(self, other)
>             elif isinstance(other, timedelta):
>                 try:
>                     sub = self.tzinfo.__datetime_sub__
>                 except AttributeError:
>                     pass
>                 else:
>                     return sub(self, other)
>                 return self + -other
>             else:
>                 return NotImplemented
>             # current implementation
>
>
> Addition
> --------
>
> Addition of a timedelta to a datetime instance will be delegated to the
> ``self.tzinfo.__datetime_add__`` method whenever it is defined.
>
>
> Strict arithmetics
> ==================
>
> A new abstract subclass of the ``datetime.tzinfo`` class called
> ``datetime.tzstrict`` will be added to the ``datetime`` module. This
> subclass will not implement the ``utcoffset()``, ``tzname()`` or
> ``dst()`` methods, but will implement some of the methods of the PDDM.
>
> The PDDM methods implemented by ``tzstrict`` will be equivalent to the
> following::
>
>     class tzstrict(tzinfo):
>         def __datetime_diff__(self, dt1, dt2):
>             utc_dt1 = dt1.astimezone(timezone.utc)
>             utc_dt2 = dt2.astimezone(timezone.utc)
>             return utc_dt2 - utc_dt1
>
>         def __datetime_add__(self, dt, delta):
>             utc_dt = dt.astimezone(timezone.utc)
>             return (utc_dt + delta).astimezone(self)
>
>         def __datetime_sub__(self, dt, delta):
>             utc_dt = dt.astimezone(timezone.utc)
>             return (utc_dt - delta).astimezone(self)
>
>
> Parsing and formatting
> ----------------------
>
> Datetime methods ``strftime`` and ``isoformat`` will delegate to the
> namesake methods of their ``tzinfo`` members whenever those methods
> are defined.
>
> When the ``datetime.strptime`` method is given a format string that
> contains a ``%Z`` instruction, it will look up the ``tzinfo``
> implementation in the registry by the given timezone name and call its
> ``__datetime_strptime__`` method.
>
> Applications
> ============
>
> This PEP will enable third party implementation of many different
> timekeeping schemes including:
>
> * Julian / Microsoft Excel calendar.
> * "Right" timezones with the leap second support.
> * French revolutionary calendar (with a lot of work).
>
> Copyright
> =========
>
> This document has been placed in the public domain.

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
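A minimal sketch of the gap between today's "classic" arithmetic and the "strict" rules discussed above (convert to UTC, do the math, convert back), using the PEP's own 2014-11-01/2014-11-02 example. The names ToyEastern2014, strict_add and strict_diff are hypothetical and exist only for this illustration; they are not part of the datetime module or of the PEP, and the hard-coded transition dates cover 2014 only.

```python
from datetime import datetime, timedelta, timezone, tzinfo

HOUR = timedelta(hours=1)
ZERO = timedelta(0)


class ToyEastern2014(tzinfo):
    """Hypothetical stand-in for US/Eastern; its rules cover 2014 only."""

    # Naive wall-clock bounds of the 2014 DST period.
    DST_START = datetime(2014, 3, 9, 2)
    DST_END = datetime(2014, 11, 2, 1)  # the repeated hour counts as standard time

    def dst(self, dt):
        naive = dt.replace(tzinfo=None)
        return HOUR if self.DST_START <= naive < self.DST_END else ZERO

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


def strict_add(dt, delta):
    """Duration arithmetic: convert to UTC, add, convert back."""
    return (dt.astimezone(timezone.utc) + delta).astimezone(dt.tzinfo)


def strict_diff(later, earlier):
    """Elapsed time between two aware datetimes, measured in UTC."""
    return later.astimezone(timezone.utc) - earlier.astimezone(timezone.utc)


tz = ToyEastern2014()
start = datetime(2014, 11, 1, 12, tzinfo=tz)
end = datetime(2014, 11, 2, 12, tzinfo=tz)

classic_sum = start + timedelta(hours=24)            # wall clock still reads 12:00
strict_sum = strict_add(start, timedelta(hours=24))  # wall clock reads 11:00

classic_diff = end - start                   # 24 hours: shared tzinfo is ignored
strict_diff_value = strict_diff(end, start)  # 25 hours actually elapsed
```

The car-rental case wants classic_sum and classic_diff; a program measuring elapsed time wants the strict variants. Both are computed here from the same tzinfo, which is the choice the PEP proposes to move into the tzinfo class itself.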
URL: From tim.peters at gmail.com Wed Aug 12 04:31:00 2015 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 11 Aug 2015 21:31:00 -0500 Subject: [Datetime-SIG] Drop datetime's offset restrictions? Message-ID: I propose dropping datetime's offset restrictions. Currently, datetime's .dst() and .utcoffset() raise an exception if their tzinfo implementations return a timedelta representing anything other than a whole number of minutes in (-1440, 1440) (i.e., magnitude strictly less than a day). The intent was to help catch programmer errors, but I'm not sure it's ever caught one. The "inspiration" came from staring at various C time-and-date libraries at the time, and puzzling over lines like: #define MWEEK 12 Huh? 12 what? Seconds, minutes, time zones? Is "M" short for million, or milli, or modern, or what? But in Python the timedelta constructor is typically used with keyword arguments, making the intended unit blindingly obvious. If you intend 1 hour, you say hours=1; etc. So enforcing restrictions burns time and doc space without clear benefit. The rest of the code doesn't care. Any representable duration should work fine. The Olson ("zoneinfo") database contains historical offsets needing 1-second resolution, which is the smallest unit it can represent. That's a compromise too; e.g., here from its "THEORY" file: Sometimes historical timekeeping was specified more precisely than what the tz database can handle. For example, from 1909 to 1937 Netherlands clocks were legally UT+00:19:32.13, but the tz database cannot represent the fractional second. But Python can: >>> print(timedelta(minutes=19, seconds=32, milliseconds=130)) 0:19:32.130000 I don't really care about supporting such silliness, but neither do I want to frustrate someone who does. Whaddya think? Continue nagging people for no reason at all, or let freedom bloom? 
;-) From alexander.belopolsky at gmail.com Wed Aug 12 06:43:07 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 12 Aug 2015 00:43:07 -0400 Subject: [Datetime-SIG] Drop datetime's offset restrictions? In-Reply-To: References: Message-ID: On Tue, Aug 11, 2015 at 10:31 PM, Tim Peters wrote: > I propose dropping datetime's offset restrictions. I believe after my work [1] from 5 years ago, implementing this should be a simple matter of removing a few if statements. I stopped short of the original goal mainly because I did not want to deal with the question of how datetimes with sub-minute offsets should be printed when the relevant standard only gives you four digits for the UTC offset. [2] [1]: http://bugs.python.org/issue5288 [2]: http://bugs.python.org/issue5288#msg109510 From chris.barker at noaa.gov Wed Aug 12 17:37:02 2015 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 12 Aug 2015 08:37:02 -0700 Subject: [Datetime-SIG] Drop datetime's offset restrictions? In-Reply-To: References: Message-ID: <7905154221311431118@unknownmsgid> > Whaddya think? Continue nagging people for no reason at all, or let > freedom bloom? ;-) As long as we're poking around in datetime, I say let freedom bloom! -Chris From alexander.belopolsky at gmail.com Wed Aug 12 18:05:45 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 12 Aug 2015 12:05:45 -0400 Subject: [Datetime-SIG] Drop datetime's offset restrictions? In-Reply-To: <7905154221311431118@unknownmsgid> References: <7905154221311431118@unknownmsgid> Message-ID: On Wed, Aug 12, 2015 at 11:37 AM, Chris Barker - NOAA Federal wrote: >> Whaddya think? Continue nagging people for no reason at all, or let >> freedom bloom? ;-) > > As long as we're poking around in datetime, I say let freedom bloom! +0, but will be +1 if someone comes up with a specific proposal on how printing and parsing of extended precision timezones will work. See e.g., . 
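A short sketch of how printing and parsing of a sub-minute offset can look, using the Netherlands offset quoted from the "THEORY" file. It assumes a later Python (3.7 or newer, to the best of my knowledge), where the whole-minute restriction on offsets was lifted and isoformat()/fromisoformat() extend the offset as +HH:MM:SS.ffffff; it does not show behaviour that existed at the time of this thread. The variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Legal Netherlands offset from 1909 to 1937, per the tz "THEORY" file.
old_offset = timedelta(minutes=19, seconds=32, milliseconds=130)
amsterdam_old = timezone(old_offset)

dt = datetime(1920, 1, 1, 12, 0, tzinfo=amsterdam_old)

# Printing: the offset keeps its seconds and fraction instead of dropping them.
text = dt.isoformat()

# Parsing: the result carries a fixed-offset tzinfo, not a UTC one.
parsed = datetime.fromisoformat(text)

print(text)
print(parsed.utcoffset() == old_offset)
```

The string produced this way is not valid ISO 8601, so it round-trips within Python but may be rejected elsewhere, which is the trade-off debated in this thread.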
From chris.barker at noaa.gov Wed Aug 12 22:44:05 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Wed, 12 Aug 2015 13:44:05 -0700
Subject: [Datetime-SIG] Drop datetime's offset restrictions?
In-Reply-To:
References: <7905154221311431118@unknownmsgid>
Message-ID:

On Wed, Aug 12, 2015 at 9:05 AM, Alexander Belopolsky <alexander.belopolsky at gmail.com> wrote:

> On Wed, Aug 12, 2015 at 11:37 AM, Chris Barker - NOAA Federal
> wrote:
> >> Whaddya think? Continue nagging people for no reason at all, or let
> >> freedom bloom? ;-)
> >
> > As long as we're poking around in datetime, I say let freedom bloom!
>
> +0, but will be +1 if someone comes up with a specific proposal on how
> printing and parsing of
> extended precision timezones will work.
>
> See e.g., .
>

I'm a bit confused on what one has to do with the other?

Or is this about the fact that ISO 8601 seems to specify that the offset be defined as:

HH:MM

so no room for seconds? Though it seems the obvious extension is to simply add the seconds:

HH:MM:SS

Not that parsing of ISO 8601 is not a great idea to have in datetime. I've often wondered why it isn't there.

BTW, numpy's datetime64 does have it, and it's more or less the standard way to create one. No idea what it does if you try to pass in an offset with seconds... -- yes, I do, not hard to check:

In [54]: np.datetime64('2015-08-03T13:12:11-07:00:13')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in ()
----> 1 np.datetime64('2015-08-03T13:12:11-07:00:13')

ValueError: Error parsing datetime string "2015-08-03T13:12:11-07:00:13" at position 25

The problem I see with ISO 8601 is that (AFAICT) there is no way to specify a time zone -- only a UTC offset.

and offsets are not unique.
So if someone wants to create a datetime object from an ISO 8601 string, they have two options:

* The string does not contain an offset -- they should get a naive datetime

* The string does contain an offset -- they have to get a datetime with a UTC tzinfo object, with the offset having been applied.

Is there any issue here other than someone needs to find or write the code and persuade everyone that it's clean and maintainable enough for the stdlib?

-Chris

From carl at oddbird.net Wed Aug 12 22:51:01 2015
From: carl at oddbird.net (Carl Meyer)
Date: Wed, 12 Aug 2015 14:51:01 -0600
Subject: [Datetime-SIG] Drop datetime's offset restrictions?
In-Reply-To:
References: <7905154221311431118@unknownmsgid>
Message-ID: <55CBB1B5.5080205@oddbird.net>

Tangent:

On 08/12/2015 02:44 PM, Chris Barker wrote:
[snip]
> The problem I see with ISO 8601 is that (AFAICT) there is no way to
> specify a time zone -- only a UTC offset.
>
> and offsets are not unique.
>
> So if someone wants to create a datetime object from an ISO 8601 string,
> they have two options:
>
> * The string does not contain an offset -- they should get a naive datetime
>
> * The string does contain an offset -- they have to get a datetime with
> a UTC tzinfo object, with the offset having been applied.

No, they should get a datetime object matching the ISO datetime they passed in, with a FixedOffset tzinfo instance attached to it.

Carl
From chris.barker at noaa.gov Wed Aug 12 23:01:22 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Wed, 12 Aug 2015 14:01:22 -0700
Subject: [Datetime-SIG] Drop datetime's offset restrictions?
In-Reply-To: <55CBB1B5.5080205@oddbird.net>
References: <7905154221311431118@unknownmsgid> <55CBB1B5.5080205@oddbird.net>
Message-ID:

On Wed, Aug 12, 2015 at 1:51 PM, Carl Meyer wrote:

> Tangent:
>
> > * The string does contain an offset -- they have to get a datetime with
> > a UTC tzinfo object, with the offset having been applied.
>
> No, they should get a datetime object matching the ISO datetime they
> passed in, with a FixedOffset tzinfo instance attached to it.

indeed -- I'd forgotten how easy it was to create one of those out of the box.

-Chris

>
> Carl

From mal at egenix.com Wed Aug 12 23:19:41 2015
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 12 Aug 2015 23:19:41 +0200
Subject: [Datetime-SIG] Drop datetime's offset restrictions?
In-Reply-To:
References: <7905154221311431118@unknownmsgid>
Message-ID: <55CBB86D.70908@egenix.com>

On 12.08.2015 22:44, Chris Barker wrote:
> On Wed, Aug 12, 2015 at 9:05 AM, Alexander Belopolsky <
> alexander.belopolsky at gmail.com> wrote:
>
>> On Wed, Aug 12, 2015 at 11:37 AM, Chris Barker - NOAA Federal
>> wrote:
>>>> Whaddya think?
Continue nagging people for no reason at all, or let >>>> freedom bloom? ;-) >>> >>> As long as we're poking around in datetime, I say let freedom bloom! >> >> +0, but will be +1 if someone comes up with a specific proposal on how >> printing and parsing of >> extended precision timezones will work. >> >> See e.g., . >> > > I'm a bit confused on what one has to do with the other? > > Or is this about the fact that ISO 8601 seems to specify that the offset be > defined as: > > HH:MM > > so no room for seconds? Though it's seems the obvious extension is to > simply add the seconds: > > HH:MM:SS > > Not that parsing of ISO 8601 is not a great idea to have in datetime. I've > often wondered why it isn't there. I'd say: simply drop the seconds offset on output. It's better to conform to the standard than to break interoperability: https://en.wikipedia.org/wiki/ISO_8601#Time_offsets_from_UTC For those few cases where you'd need them, you can fiddle with the time specification before the UTC offset to have the end result still represent the correct point in date/time. > BTW, numpy's datetime64 does have it, and it's more or less the standard > way to create one. No idea what it does if you try to pass in an offset > with seconds... -- yes, I do, not hard to check: > > In [54]: np.datetime64('2015-08-03T13:12:11-07:00:13') > --------------------------------------------------------------------------- > ValueError Traceback (most recent call last) > in () > ----> 1 np.datetime64('2015-08-03T13:12:11-07:00:13') > > ValueError: Error parsing datetime string "2015-08-03T13:12:11-07:00:13" at > position 25 > > > The problem I see with ISO 8601 is that (AFAICT) there is no way to specify > a time zone -- only a UTC offset. > > and offsets are not unique. 
Using UTC offsets is the only sane way to write down a point in time without having to carry around Her Majesty's Nautical Almanac all the time :-) Seriously, timezone names are ambiguous and a rather poor way to spell a time, unless you have additional context to help you decipher the time zone, look up the regular UTC offset (they tend to change every so often), check for DST and then take it from there. The Olson database is an interesting read on this subject, e.g. ftp://ftp.iana.org/tz/data/europe > So if someone wants to create a datetime object from an ISO 8601 string, > they have two options: > > * The string does not contain an offset -- they should get a naive datetime > > * The string does contain an offset -- they have to get a datetime with a > UTC tzinfo object, with the offset having been applied. > > Is there any issue here other than someone needs to find or write the code > and persuade everyone that it's clean an maintainable enough for the stdlib? -- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Source (#1, Aug 12 2015) >>> Python Projects, Coaching and Consulting ... http://www.egenix.com/ >>> mxODBC Plone/Zope Database Adapter ... http://zope.egenix.com/ >>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/ ________________________________________________________________________ 2015-08-12: Released mxODBC 3.3.4 ... http://egenix.com/go80 2015-07-30: Released eGenix pyOpenSSL 0.13.11 ... http://egenix.com/go81 2015-08-22: FrOSCon 2015 ... 10 days to go eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ From tim.peters at gmail.com Thu Aug 13 04:01:04 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 12 Aug 2015 21:01:04 -0500 Subject: [Datetime-SIG] Drop datetime's offset restrictions? 
In-Reply-To: <55CBB86D.70908@egenix.com>
References: <7905154221311431118@unknownmsgid> <55CBB86D.70908@egenix.com>
Message-ID:

[M.-A. Lemburg]
> ...
> I'd say: simply drop the seconds offset on output. It's better to
> conform to the standard than to break interoperability:
>
> https://en.wikipedia.org/wiki/ISO_8601#Time_offsets_from_UTC

I'd say the opposite: there are no _current_ standard or DST rules, in the Olson database, that require sub-minute resolution. While there are many standard-offset rules in the database that do, they were all retired at least dozens of years ago, and many over a century ago.

So if someone is using such a thing, they're doing something with long-gone historical times that presumably _requires_ knowing the sub-minute portion of the offset. Such uses would be ill-served by silently discarding this information.

Python could support it fine, so such uses would be well served by extending the standard format in the obvious ways (to allow optional [[:]ss[.mmmmmm]] in offsets). Any timezone requiring such a thing would only be writable and readable in Python, but so it goes. If it were common enough to matter widely, 8601 would have catered to it to begin with.

By producing a non-standard format in such cases, the user at least gets a good chance of getting yelled at loudly if they try to read such a format on a system that doesn't support what they're doing.

> For those few cases where you'd need them, you can fiddle
> with the time specification before the UTC offset to have
> the end result still represent the correct point in
> date/time.

This pretty much requires that the user be a bona fide expert on the topic. Under the alternative, a user mucking with historical datetimes suffers no errors or surprises provided they stay entirely within Python. They only suffer a problem if they try to use a non-standard 8601-ish string on a platform that doesn't support it.
That will drive them to do the obvious thing: stick with Python, which understands and loves them ;-) From alexander.belopolsky at gmail.com Sun Aug 16 02:49:01 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 15 Aug 2015 20:49:01 -0400 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement Message-ID: PEP 495 [1] is a deliberately minimalistic proposal to remove an ambiguity in representing some local times as datetime.datetime objects. The concept has been extensively discussed on python-ideas and this mailing list. I believe a consensus has been reached and it is reflected in the PEP. PEP 495 does not propose any changes to datetime/timedelta arithmetics, but it has been agreed that it is a necessary step for implementing the "strict" rules. [1]: https://www.python.org/dev/peps/pep-0495 From ethan at stoneleaf.us Sun Aug 16 04:58:34 2015 From: ethan at stoneleaf.us (Ethan Furman) Date: Sat, 15 Aug 2015 19:58:34 -0700 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: Message-ID: <55CFFC5A.8030208@stoneleaf.us> Correction to PEP: ---------------------------------- [about halfway down] The value returned by dt.timestamp() given a missing dt will be the larger of the two "nice to know" values if dt.first == True and the larger otherwise. Presumably the first "larger" should be "smaller". ---------------------------------- [about three/fourths of the way down] Temporal Arithmetics The value of "first" will be ignored in all operations except those that involve conversion between timezones. [2] As a consequence, datetime.datetime` or datetime.time instances ... "datetime.datetime" is missing a leading backtick. ---------------------------------- The PEP should be presented on py-dev for discussion (or at least for pronouncement). Are the strict tzinfo's for a different PEP? 
-- ~Ethan~ From alexander.belopolsky at gmail.com Sun Aug 16 05:17:19 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 15 Aug 2015 23:17:19 -0400 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: <55CFFC5A.8030208@stoneleaf.us> References: <55CFFC5A.8030208@stoneleaf.us> Message-ID: On Sat, Aug 15, 2015 at 10:58 PM, Ethan Furman wrote: > Correction to PEP: > > ---------------------------------- > [about halfway down] > The value returned by dt.timestamp() given a missing dt will be the larger > of the two "nice to know" values if dt.first == True and the larger > otherwise. > > Presumably the first "larger" should be "smaller". > ---------------------------------- No. The rule for the missing time is the opposite to that for the ambiguous time. This allows a program that probes the TZ database by calling timestamp with two different values of the "first" flag to avoid any additional calls to differentiate between the gap and the fold. > [about three/fourths of the way down] > Temporal Arithmetics > > The value of "first" will be ignored in all operations except those that > involve conversion between timezones. [2] As a consequence, > datetime.datetime` or datetime.time instances ... > > "datetime.datetime" is missing a leading backtick. > ---------------------------------- Will fix. Thanks. > > > The PEP should be presented on py-dev for discussion (or at least for > pronouncement). According to PEP 1: "PEP review and resolution may also occur on a list other than python-dev (for example, distutils-sig for packaging related PEPs that don't immediately affect the standard library). In this case, the "Discussions-To" heading in the PEP will identify the appropriate alternative list where discussion, review and pronouncement on the PEP will occur." [1] PEP 495 has "Discussions-To" heading set to this mailing list. 
I have no problem cross-posting to python-dev, but I am not sure this is the right thing to do. This SIG was created for a reason. > > Are the strict tzinfo's for a different PEP? Yes, PEP-0500. [1]: https://www.python.org/dev/peps/pep-0001/#pep-review-resolution From tim.peters at gmail.com Sun Aug 16 05:27:37 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 15 Aug 2015 22:27:37 -0500 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: <55CFFC5A.8030208@stoneleaf.us> Message-ID: [Ethan Furman] >> Correction to PEP: >> >> ---------------------------------- >> [about halfway down] >> The value returned by dt.timestamp() given a missing dt will be the larger >> of the two "nice to know" values if dt.first == True and the larger >> otherwise. >> >> Presumably the first "larger" should be "smaller". [Alexander] > No. The rule for the missing time is the opposite to that for the > ambiguous time. > This allows a program that probes the TZ database by calling timestamp with > two different values of the "first" flag to avoid any additional calls > to differentiate between the gap and the fold. Did you note that the original quote uses "larger" twice? will be the larger of the two if ... and the larger otherwise ^^^^^^ ^^^^^^ _One_ of them should surely say "smaller" instead ;-) Although I'd use "earlier" and "later". From alexander.belopolsky at gmail.com Sun Aug 16 05:55:35 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 15 Aug 2015 23:55:35 -0400 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: <55CFFC5A.8030208@stoneleaf.us> Message-ID: On Sat, Aug 15, 2015 at 11:27 PM, Tim Peters wrote: > Did you note that the original quote uses "larger" twice? > > will be the larger of the two if ... and the larger otherwise > ^^^^^^ ^^^^^^ Ah, the perils of copy-and-paste! Got it now. Will fix. Thanks! 
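A sketch of the probing trick described above: calling timestamp() with both values of the flag and comparing the results tells a gap from a fold with no further calls. It assumes the flag as it eventually shipped in Python 3.6, named `fold` (fold=0 corresponds to first=True in this thread, fold=1 to first=False). ToyEastern2014 and classify are hypothetical names for this illustration, and the toy rules cover 2014 only.

```python
from datetime import datetime, timedelta, tzinfo

HOUR = timedelta(hours=1)
ZERO = timedelta(0)


class ToyEastern2014(tzinfo):
    """Hypothetical US/Eastern stand-in, valid for 2014 only."""

    GAP_START = datetime(2014, 3, 9, 2)    # 02:00-03:00 never shown on the clock
    FOLD_START = datetime(2014, 11, 2, 1)  # 01:00-02:00 shown twice

    def dst(self, dt):
        naive = dt.replace(tzinfo=None)
        if self.GAP_START <= naive < self.GAP_START + HOUR:
            # Missing time: fold=0 extrapolates the pre-transition (standard) rule.
            return HOUR if dt.fold else ZERO
        if self.FOLD_START <= naive < self.FOLD_START + HOUR:
            # Ambiguous time: fold=0 is the first (daylight) reading.
            return ZERO if dt.fold else HOUR
        if self.GAP_START + HOUR <= naive < self.FOLD_START:
            return HOUR
        return ZERO

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)


def classify(naive, tz):
    """Return 'fold', 'gap' or 'normal' using only two timestamp() calls."""
    first = naive.replace(tzinfo=tz, fold=0).timestamp()
    second = naive.replace(tzinfo=tz, fold=1).timestamp()
    if first < second:
        return "fold"    # ambiguous time: fold=0 is the earlier instant
    if first > second:
        return "gap"     # missing time: the ordering is reversed
    return "normal"


tz = ToyEastern2014()
print(classify(datetime(2014, 11, 2, 1, 30), tz))
print(classify(datetime(2014, 3, 9, 2, 30), tz))
print(classify(datetime(2014, 7, 1, 12, 0), tz))
```

Because the rule for missing times is the reverse of the rule for ambiguous times, the sign of the difference alone identifies the case.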
From alexander.belopolsky at gmail.com Sun Aug 16 06:02:02 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 16 Aug 2015 00:02:02 -0400 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: <55CFFC5A.8030208@stoneleaf.us> Message-ID: On Sat, Aug 15, 2015 at 11:27 PM, Tim Peters wrote: > Although I'd use "earlier" and "later" [timestamp value]. The timestamp value is a floating point number. As such, it can be larger or smaller, but not earlier or later. From alexander.belopolsky at gmail.com Sun Aug 16 06:10:16 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 16 Aug 2015 00:10:16 -0400 Subject: [Datetime-SIG] Datetime - my issues In-Reply-To: References: Message-ID: On Wed, Jul 29, 2015 at 2:41 PM, Skip Montanaro wrote: > Corollary: It seems to me like the astimezone call ought to work: See . From guido at python.org Sun Aug 16 16:12:00 2015 From: guido at python.org (Guido van Rossum) Date: Sun, 16 Aug 2015 17:12:00 +0300 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: <55CFFC5A.8030208@stoneleaf.us> Message-ID: On Sun, Aug 16, 2015 at 7:02 AM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > On Sat, Aug 15, 2015 at 11:27 PM, Tim Peters wrote: > > Although I'd use "earlier" and "later" [timestamp value]. > > The timestamp value is a floating point number. As such, it can be > larger or smaller, but not earlier or later. > But it represents time on a timeline in specific units. I agree with Tim that using earlier and later would be clearer to the reader. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
From tim.peters at gmail.com Sun Aug 16 20:34:51 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 16 Aug 2015 13:34:51 -0500
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To:
References:
Message-ID:

[Alexander Belopolsky]
> PEP 495 [1] is a deliberately minimalistic proposal to remove an
> ambiguity in representing some local times as datetime.datetime
> objects.
> ...

Just noting that the "Guidelines for new tzinfo implementations" section uses the term "hour" when discussing DST transitions, which it defines as

    hour = timedelta(hours=1)

The confusion here is that DST adjustments aren't always an hour. Running an ad hoc Python script over the Olson data files, I get this distribution of DST adjustments (the string value followed by the number of times it appears):

    '0'     910
    '0:00'    1
    '0:20'    2
    '0:30'   20
    '0:40'    1
    '1'       3  # best I can tell, this means '1:00' (one hour)
    '1:00'  983
    '2:00'   23

Almost all of these apply only to historical dates, but since we're trying to cater to things that don't matter anyway ... ;-)

For another annoyance, it looks like at least one place in the world has more than one kind of DST during each year. Here from the discussion of Antarctica/Troll in the "antarctica" file:

    # I recently had a long dialog about this with the developer of timegenie.com.
    # In the absence of specific dates, he decided to choose some likely ones:
    #   GMT +1 - From March 1 to the last Sunday in March
    #   GMT +2 - From the last Sunday in March until the last Sunday in October
    #   GMT +1 - From the last Sunday in October until November 7
    #   GMT +0 - From November 7 until March 1

So, top to bottom, the DST offset changes from 0 to 1, then from 1 to 2, then back to 1 again, and finally to 0. The rules for that are currently commented out, because the comments note that zic had to be changed (in 2014) to handle them correctly, and they don't want to uncomment the rules until the most recent zic is widely adopted.
The current rules ignore the two periods with the +1 adjustment, so DST jumps between 0 and +2 in Antarctica/Troll.

In any case, that means references to "zero" in the DST discussion are also too specific.

If I were you, I'd leave the text alone, but add a footnote explaining that things may need to be adjusted in the obvious ways for timezones with oddball DST adjustments and/or more than one kind of DST per year. I don't see any problems here `first' doesn't handle in theory - these headaches belong to tzinfo implementers.

From guido at python.org Sun Aug 16 21:23:59 2015
From: guido at python.org (Guido van Rossum)
Date: Sun, 16 Aug 2015 22:23:59 +0300
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To:
References:
Message-ID:

I think that a courtesy message to python-dev is appropriate, with a link to the PEP and an invitation to discuss its merits on datetime-sig.

Here are my thoughts about the PEP so far, and some nits about the text (sorry for hopping around in the text a bit, I tried to sort my bullets but I reviewed the PEP skipping back and forth a lot, so this reflects that):

- I'm surprised the name of the proposed flag doesn't occur in the abstract.

- The rationale might explicitly mention the two cases we're thinking about: DST transitions and adjustments to the timezone's base offset -- noting that the latter may be an arbitrary interval (not just an hour).

- The sidebar doesn't show up as a sidebar, but as somewhat mysterious text, on https://www.python.org/dev/peps/pep-0495/ (it does on legacy.python.org, but we're trying to avoid that site). Maybe you should file a bug with the pydotorg project on GitHub (if you haven't already). (While I like the artwork, it's a bit un-PEP-like, and maybe not worth it given the problems making the image appear properly.)

- Conversely, on legacy.python.org there are some error messages about "Unknown directive type "code"" (lines 112, 118).

- "a fold is created in the fabric of time" sounds a bit like science-fiction. I'd just say "a time fold is created", or "a fold is created in time".

- Despite having read the section about the naming, I'm still not wild about the name 'first'. This is in part because this requires True as the default, in part because without knowing the background its meaning is somewhat mysterious. I'm not wild about the alternatives either, so perhaps this requires more bikeshedding. :-( (FWIW I agree that the name should not reference DST, since time folds may appear for other reasons.) Hmm... Maybe "fold=True" to select the second occurrence?

- I'm a bit surprised that this flag doesn't have three values (e.g. None, True, False) -- in C, the tm_isdst flag in struct tm can be -1, 0 and 1, where -1 means "figure it out" or "don't care". The "don't care" case should allow stricter backward compatibility.

- "[1] An instance that has first=False in a non-ambiguous case is said to represent an invalid time ..." Could you quickly elaborate here whether such an invalid time is considered an hour later than the valid corresponding time with first=True, given a reasonable timezone with and without DST?

- "In CPython, a non-boolean value of first will raise a TypeError, but other implementations may allow the value None to behave the same as when first is not given." This is surprisingly lenient. Why allow the second behavior at all? (Surely all Python implementations can distinguish between a value equal to None and a missing value, even if some kind of hack is needed.) Also, why this clause for replace() but not for other methods?

- I'm disappointed that there are now some APIs that explicitly treat a naive datetime as local (using the system timezone). I carefully avoided such interpretation in the original design, since a naive datetime can also be used to represent a point in UTC, or in some timezone that's implicit. But I guess this cat is out of the bag since it's already assumed by timestamp() and fromtimestamp(). :-(

- "Conversion from POSIX seconds from EPOCH" I'd move this section before the opposite clause, since it is simpler and the other clause references fromtimestamp(). The behavior of fromtimestamp() may also be considered motivational for having only the values True and False for the flag.

- "New guidelines will be published for implementing concrete timezones with variable UTC offset." Where? (Is this just a forward reference to the next section? Then I'd drop it.)

- "... must follow these guidelines." Here "must" is very strong (it is the strongest word in "standards speak", stronger than "should", "ought to", "may"). I recommend "should", that's strong enough. Also, doesn't this apply to all tzinfo subclasses except fixed-offset ones? That would mean that any current tzinfo subclass is deemed non-compliant with this PEP. That feels too strong. Your use of "new subclasses" also suggests that this doesn't apply to existing subclasses, but that's not enough -- I think it should be reasonable for someone writing a tzinfo subclass to continue to ignore the "first" flag, and still be considered a valid tzinfo subclass, if they don't care about what happens in time folds. (Note that I have nothing against the guidelines themselves -- just against the implication that not following them makes a class in some way "non-compliant", which is a pretty strong damnation in standards speak.)

- "We chose the minute byte to store the "first" bit because this choice preserves the natural ordering." This only works with folds of exactly one hour. Also, is the natural ordering (of the pickles, apparently) used anywhere? I would hope not. Finally, given that two times that differ only in their 'first' flag compare equal, the natural ordering (if relevant :-) would be to store/compare the 'first' bit last.
- Temporal Arithmetic (probably shouldn't have an "s" at the end): this probably needs some motivation. I think it's inevitable (since we don't know the size of the time fold), but it still feels weird. - "[2] As of Python 3.5, tzinfo is ignored whenever timedelta is added or subtracted ..." I don't see a reason for this footnoote to discuss possible future changes to datetime arithmetic; leave that up to the respective PEP. (OTOH you may have a specific follow-up PEP in mind, and it may be better to review this one in the light of the follow-up PEP.) - "This proposal will have little effect on the programs that do not read the first flag explicitly or use tzinfo implementations that do." This seems ambiguous -- if I use a tzinfo implementation that reads the first flag, am I affected or not? Also, "the programs" should be just "programs", and I'm kind of curious why the hedging of "little effect" (rather than "no effect") is needed. Also, you might give some examples of changes that programs that *do* use the first flag may experience. - In a reply to this thread, you wrote "The rule for the missing time is the opposite to that for the ambiguous time. This allows a program that probes the TZ database by calling timestamp with two different values of the "first" flag to avoid any additional calls to differentiate between the gap and the fold." Can you clarify this (I'm not sure how this works, though I intuitively agree that the two rules should be each other's opposite) and add it to the PEP? - Would there be any merit in proposing, together with the idea of a three-value flag, that datetime arithmetic should use "timeline arithmetic" if the flag is defined and a tzinfo is present? On Sun, Aug 16, 2015 at 3:49 AM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > PEP 495 [1] is a deliberately minimalistic proposal to remove an > ambiguity in representing some local times as datetime.datetime > objects. 
>
> The concept has been extensively discussed on python-ideas and this
> mailing list.  I believe a consensus has been reached and it is
> reflected in the PEP.
>
> PEP 495 does not propose any changes to datetime/timedelta
> arithmetics, but it has been agreed that it is a necessary step for
> implementing the "strict" rules.
>
> [1]: https://www.python.org/dev/peps/pep-0495
>

-- 
--Guido van Rossum (python.org/~guido)

From alexander.belopolsky at gmail.com Sun Aug 16 21:41:00 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sun, 16 Aug 2015 15:41:00 -0400
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To:
References:
Message-ID:

On Sun, Aug 16, 2015 at 2:34 PM, Tim Peters wrote:
> Just noting that the "Guidelines for new tzinfo implementations"
> section uses the term "hour" when discussing DST transitions, which it
> defines as
>
>     hour = timedelta(hours=1)
>
> The confusion here is that DST adjustments aren't always an hour.

Yes, I was planning to change that to "delta".  Note that in the
reference implementation, I added a case using the Lord Howe Island
time zone, which has a 30 min DST delta.  I'll make sure
Antarctica/Troll is supported correctly as well.

> If I were you, I'd leave the text alone, but add a footnote explaining
> that things may need to be adjusted in the obvious ways for timezones
> with oddball DST adjustments and/or more than one kind of DST per
> year.

Makes sense.  The US-style DST rules are confusing enough that we
probably don't want to complicate the discussion.  I'll add a
footnote.
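
As a rough illustration of why "hour" is too specific, the tally Tim
posted can be folded into timedelta values.  This is only a sketch:
parse_save is a hypothetical helper, the counts are copied from his
message rather than recomputed from the Olson files, and reading a
bare '1' as one hour follows his "best I can tell" guess.

```python
from collections import Counter
from datetime import timedelta


def parse_save(field: str) -> timedelta:
    """Turn a DST-adjustment string such as '0', '1', '0:30' or '2:00'
    into a timedelta.  Hypothetical helper; signs are not handled."""
    parts = field.split(":")
    hours = int(parts[0])
    minutes = int(parts[1]) if len(parts) > 1 else 0
    return timedelta(hours=hours, minutes=minutes)


# Counts as quoted in the message above (string value -> occurrences).
raw_counts = {
    "0": 910, "0:00": 1, "0:20": 2, "0:30": 20,
    "0:40": 1, "1": 3, "1:00": 983, "2:00": 23,
}

# Merge spellings that denote the same adjustment ('1' and '1:00', etc.).
by_delta = Counter()
for field, n in raw_counts.items():
    by_delta[parse_save(field)] += n

# Adjustments that are neither zero nor exactly one hour.
odd = {d: n for d, n in by_delta.items()
       if d not in (timedelta(0), timedelta(hours=1))}

for delta, n in sorted(by_delta.items()):
    print(delta, n)
```

Under those assumptions the non-zero, non-hour adjustments are the
20, 30 and 40 minute ones plus the two-hour one, which is the set a
"delta"-based wording would need to cover.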
From tim.peters at gmail.com Sun Aug 16 21:59:23 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 16 Aug 2015 14:59:23 -0500
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To:
References:
Message-ID:

[Guido]
> ...
> - In a reply to this thread, you wrote "The rule for the missing time is the
> opposite to that for the ambiguous time. This allows a program that probes
> the TZ database by calling timestamp with two different values of the
> "first" flag to avoid any additional calls to differentiate between the gap
> and the fold." Can you clarify this (I'm not sure how this works, though I
> intuitively agree that the two rules should be each other's opposite) and
> add it to the PEP?

If ts_true is the timestamp with first=True, and ts_false the
timestamp with first=False, then:

- The original time was ambiguous (fold) iff ts_true < ts_false.

- The original time was invalid (in a gap) iff ts_true > ts_false.

- The original time was neither ambiguous nor invalid iff
  ts_true == ts_false.

At least that was my guess ;-)  I agree the PEP should spell it out
regardless.

From tim.peters at gmail.com Sun Aug 16 22:10:23 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 16 Aug 2015 15:10:23 -0500
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To:
References:
Message-ID:

[Guido]
> ...
> - "We chose the minute byte to store the the "first" bit because this choice
> preserves the natural ordering." This only works with folds of exactly one
> hour. Also, is the natural ordering (of the pickles, apparently) used
> anywhere? I would hope not. Finally, given that two times that differ only
> in their 'first' flag compare equal, the natural ordering (if relevant :-)
> would be to store/compare the 'first' bit last.

I don't know of any place comparing pickles.
The in-memory representations are compared as bytestrings, e.g., like
this in datetime_richcompare:

    if (GET_DT_TZINFO(self) == GET_DT_TZINFO(other)) {
        diff = memcmp(((PyDateTime_DateTime *)self)->data,
                      ((PyDateTime_DateTime *)other)->data,
                      _PyDateTime_DATETIME_DATASIZE);
        return diff_to_bool(diff, op);
    }

But, presumably, unpickling will move `first` out of whichever member
it's abusing in the pickle into its own byte.  There are lots of
always-0-now bits in the pickle format, and I expect it doesn't matter
to anything which of those bits is abused.

From tim.peters at gmail.com Sun Aug 16 23:16:55 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Sun, 16 Aug 2015 16:16:55 -0500
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To:
References:
Message-ID:

[Guido]
> ...
> - Would there be any merit in proposing, together with the idea of a
> three-value flag, that datetime arithmetic should use "timeline arithmetic"
> if the flag is defined and a tzinfo is present?

It's worth considering.  At the highest level, this requires:

1. Answering whether "timeline arithmetic" does or does not account
   for gaps and folds due to leap seconds too.

2. Supplying a feasible path for those who insist the other answer is
   the only acceptable one ;-)

The way things have gone so far, the current PEP was meant to be a
"baby step" along the way, and PEP 500 goes on to refuse to even ask
#1, instead addressing #2 (tzinfo objects will grow ways to take over
datetime arithmetic, in any damn fool ;-) way they like).

But if "timeline arithmetic" comes built in, #1 has to be answered up
front - and I wouldn't be surprised then if PEP 500 died, leaving the
#1 "losers" wholly devoid of hope.

Which I take kinda seriously ;-)

- It's nuts to add a minute to a UTC datetime and see the seconds
  change ("leap seconds are insane").
- It's also nuts to subtract two UTC datetimes a minute apart and not get the actual number of seconds between them in the real world ("leap seconds are vital"). The advantage of the current approach is that it leaves both camps equally empowered - and equally challenged - to scratch their own itches. That's the political answer. As always, I'll leave the tech stuff to you eggheads ;-) From alexander.belopolsky at gmail.com Sun Aug 16 23:45:20 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 16 Aug 2015 17:45:20 -0400 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: Message-ID: On Sun, Aug 16, 2015 at 3:23 PM, Guido van Rossum wrote: > I think that a courtesy message to python-dev is appropriate, with a link to > the PEP and an invitation to discuss its merits on datetime-sig. Will do. (Does anyone know how to set Reply-To: header in Gmail?) .. > - I'm surprised the name of the proposed flag doesn't occur in the abstract. > That's because I wanted people to get to the proposal section before starting to bikeshed on the name of the flag. More on that below. > - The rationale might explicitly mention the two cases we're thinking about: > DST transitions and adjustments to the timezone's base offset -- noting that > the latter may be an arbitrary interval (not just an hour). > Actually, in either case the adjustment can be a fraction of an hour. I'll add this to the rationale. > - The sidebar doesn't show up as a sidebar, but as somewhat mysterious text, > on https://www.python.org/dev/peps/pep-0495/ (it does on legacy.python.org, > but we're trying to avoid that site). Maybe you should file a bug with the > pydotorg project on GitHub (if you haven't already). I did: . > (While I like the > artwork, it's a bit un-PEP-like, and maybe not worth it given the problems > making the image appear properly.) 
If we don't fix the layout issues before the pronouncement, I'll remove the graphic. > - Conversely, on legacy.python.org there are some error messages about > "Unknown directive type "code"" (lines 112, 118). I'll look into this. I've never had problems with ReStructuredText rendering on docs.p.o, but the peps site seems to be more restrictive. > > - "a fold is created in the fabric of time" sounds a bit like > science-fiction. I'd just say "a time fold is created", or "a fold is > created in time". > Agree. After all, a "fold" already suggests some kind of fabric. > - Despite having read the section about the naming, I'm still not wild about > the name 'first'. This is in part because this requires True as the default, > in part because without knowing the background its meaning somewhat > mysterious. I agree. My top candidate is "repeated=False", but an invitation to bikeshed, , was not met with the usual enthusiasm. To defend the "True means earlier" choice, I would mention that it matches "isdst=1 means earlier" in the fold. > I'm not wild about the alternatives either, so perhaps this > requires more bikeshedding. :-( (FWIW I agree that the name should not > reference DST, since time folds may appear for other reasons.) Hmm... Maybe > "fold=True" to select the second occurrance? I really want something that disambiguates two times based on their most natural characteristics: do you want the earlier or the later of the two choice? Anything else, in my view would require additional knowledge. > > - I'm a bit surprised that this flag doesn't have three values (e.g. None, > True, False) -- in C, the tm_isdst flag in struct tm can be -1, 0 and 1, > where -1 means "figure it out" or "don't care". With the proposed functionality, one can easily implement any of the C-style isdst logic. The problem, however is that while most C libraries agree with in their treatment of 0 and 1, the behavior on tm_isdst=-1 ranges from bad to absurd. 
For example, the value returned by mktime in the ambiguous case may depend on the arguments passed to the previous call to mktime. > The "don't care" case should allow stricter backward compatibility. I am not sure we want to maintain the behavior described in (Calling timestamp() on a datetime object modifies the timestamp of a different datetime object.) > > - "[1] An instance that has first=False in a non-ambiguous case is said to > represent an invalid time ..." Could you quickly elaborate here whether such > an invalid time is considered an hour later than the valid corresponding > time with first=True, given a reasonable timezone with and without DST? Such an instance is just *invalid* as in "February 29, 2015." In a non-ambiguous case, first=False means "the second of one", which does not make sense. Such instances should never be produced except for a narrow purpose of probing the astimezone() or timestamp() to determine whether a given datetime is ambiguous or not. > > - "In CPython, a non-boolean value of first will raise a TypeError , but > other implementations may allow the value None to behave the same as when > first is not given." This is surprisingly lenient. Why allow the second > behavior at all? Because it is currently allowed for the other arguments of replace() in the pure python datetime implementation that we ship. I will be happy to change that starting with the "first". > (Surely all Python implementations can distinguish between > a value equal to None and a missing value, even if some kind of hack is > needed.) Also, why this clause for replace() but not for other methods? What other methods? replace() is fairly unique in its treatment of arguments. > > - I'm disappointed that there are now some APIs that explicitly treat a > naive datetime as local (using the system timezone). 
I carefully avoided > such interpretation in the original design, since a naive datetime can also > be used to represent a point in UTC, or in some timezone that's implicit. > But I guess this cat is out of the bag since it's already assumed by > timestamp() and fromtimestamp(). :-( I held that siege as long as I could. > > - "Conversion from POSIX seconds from EPOCH" I'd move this section before > the opposite clause, since it is simpler and the other clause references > fromtimestamp(). The behavior of fromtimestamp() may also be considered > motivational for having only the values True and False for the flag. > Will do. > - "New guidelines will be published for implementing concrete timezones with > variable UTC offset." Where? In the official datetime documentation. I'll clarify that. > (Is this just a forward reference to the next section? Then I'd drop it.) No, I expect that section to be incorporated in the official datetime library documentation. > > - "... must follow these guidelines." Here "must" is very strong (it is the > strongest word in "standards speak", stronger than "should", "ought to", > "may"). I recommend "should", that's strong enough. OK. This is a remnant of the idea to include a first-aware fromutc() implementation, which after some private discussions with Tim we decided to abandon. In light of that idea, "must" made sense as in "in order for unmodified fromutc() work correctly with your tzinfo implementation, it *must* ..." .. > - "We chose the minute byte to store the the "first" bit because this choice > preserves the natural ordering." This only works with folds of exactly one > hour. Also, is the natural ordering (of the pickles, apparently) used > anywhere? I would hope not. Finally, given that two times that differ only > in their 'first' flag compare equal, the natural ordering (if relevant :-) > would be to store/compare the 'first' bit last. > I'll remove the rationale. The ordering is a red herring anyways. 
I needed a place to stick one bit in the 10-byte payload and the
minute byte looked like a natural place.  I made up the ordering
rationale to justify an arbitrary choice a posteriori.

> - Temporal Arithmetic (probably shouldn't have an "s" at the end):

Wikipedia is of no help here: "Arithmetic or arithmetics (from the
Greek ἀριθμός arithmos, "number") ..."  I'll check what we use in the
library docs.  (For some reason, I thought that Arithmetic is a branch
of mathematics while arithmetics is a set of rules.)

> this probably needs some motivation. I think it's inevitable (since we don't know
> the size of the time fold), but it still feels weird.
>

It's what you say, plus backward compatibility considerations.  We
want existing programs to produce the same results even if they
occasionally encounter first=False instances from, say,
datetime.now().  I'll add a footnote.

> - "[2] As of Python 3.5, tzinfo is ignored whenever timedelta is added or
> subtracted ..." I don't see a reason for this footnoote to discuss possible
> future changes to datetime arithmetic; leave that up to the respective PEP.

I'll remove the discussion of the future changes to datetime arithmetic.

> (OTOH you may have a specific follow-up PEP in mind, and it may be better to
> review this one in the light of the follow-up PEP.)

Yes, there is a PEP-0500, but it is nowhere near as ready as this one.

> - "This proposal will have little effect on the programs that do not read
> the first flag explicitly or use tzinfo implementations that do." This seems
> ambiguous -- if I use a tzinfo implementation that reads the first flag, am
> I affected or not? Also, "the programs" should be just "programs", and I'm
> kind of curious why the hedging of "little effect" (rather than "no effect")

We are changing the behavior of datetime.timestamp on naive instances.
This is really what the "hedging" is about.

> is needed. Also, you might give some examples of changes that programs that
> *do* use the first flag may experience.
I don't understand.  The programs that *do* use the first flag now
experience an AttributeError, and that will surely change.  Perhaps
you want to see some examples of how the programs can start using the
first flag?

>
> - In a reply to this thread, you wrote "The rule for the missing time is the
> opposite to that for the ambiguous time. This allows a program that probes
> the TZ database by calling timestamp with two different values of the
> "first" flag to avoid any additional calls to differentiate between the gap
> and the fold." Can you clarify this (I'm not sure how this works, though I
> intuitively agree that the two rules should be each other's opposite) and
> add it to the PEP?
>

Yes, I posted something like this before, but will include it in the PEP.

A first-aware program can do something like the following when it
gets a naive instance dt that it wants to decorate with a timezone.

    dt1 = dt.replace(first=True).astimezone()
    dt2 = dt.replace(first=False).astimezone()
    if dt1 == dt2:
        # Neither ambiguous nor missing.
        return dt1
    if dt1 < dt2:
        # Fold: two valid readings, keep the earlier one.
        warn("ambiguous time: picked %s but it could be %s" % (dt1, dt2))
        return dt1
    if dt1 > dt2:
        # Gap: the wall-clock time never happened.
        raise ValueError("invalid time", dt, dt1, dt2)

> - Would there be any merit in proposing, together with the idea of a
> three-value flag, that datetime arithmetic should use "timeline arithmetic"
> if the flag is defined and a tzinfo is present?

To add a third value, you will need a full additional bit anyways, so
why not just have a separate flag that controls the choice of
arithmetic and leave "first" a pure fold disambiguation flag?

I consider the problem of local time disambiguation and that of the
"timeline arithmetic" to be two orthogonal problems.  Yes, "timeline
arithmetic" can benefit from the first flag, but it is possible
without it.  Similarly, the problem of round-tripping the times
between timezones can benefit from "timeline arithmetic", but PEP 495
solves it without introducing the new arithmetic.
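
The probing idiom earlier in this message can be tried against the
standard library as it later shipped, where the flag is spelled fold
(0 selects the first occurrence, 1 the second) rather than first; that
spelling is not settled anywhere in this thread, so treat the mapping
first=True ~ fold=0 as an assumption of the sketch.  Eastern2015 and
classify are hypothetical names, and the toy zone hard-codes the 2015
US Eastern transitions instead of consulting a tz database.

```python
from datetime import datetime, timedelta, tzinfo

HOUR = timedelta(hours=1)


class Eastern2015(tzinfo):
    """Toy fold-aware zone: US Eastern rules for 2015 only (hypothetical)."""

    STD = timedelta(hours=-5)
    DST = timedelta(hours=-4)
    GAP_START = datetime(2015, 3, 8, 2)    # 02:00-03:00 local never happens
    FOLD_START = datetime(2015, 11, 1, 1)  # 01:00-02:00 local happens twice

    def utcoffset(self, dt):
        wall = dt.replace(tzinfo=None)
        if wall < self.GAP_START or wall >= self.FOLD_START + HOUR:
            return self.STD
        if wall < self.GAP_START + HOUR:
            # In the gap: fold=0 uses the offset before the transition,
            # fold=1 the offset after it.
            return self.STD if dt.fold == 0 else self.DST
        if wall < self.FOLD_START:
            return self.DST
        # In the fold: first occurrence is still DST, second is standard.
        return self.DST if dt.fold == 0 else self.STD

    def dst(self, dt):
        return self.utcoffset(dt) - self.STD

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


def classify(naive, tz):
    """Compare the two UTC readings of a naive local time."""
    first = naive - tz.utcoffset(naive.replace(fold=0))
    second = naive - tz.utcoffset(naive.replace(fold=1))
    if first < second:
        return "ambiguous"   # fold
    if first > second:
        return "missing"     # gap
    return "normal"


tz = Eastern2015()
for wall in (datetime(2015, 3, 8, 2, 30),
             datetime(2015, 7, 1, 12, 0),
             datetime(2015, 11, 1, 1, 30)):
    print(wall, classify(wall, tz))
```

The point of the sketch is only that two offset lookups are enough to
tell the three cases apart, because the gap rule is the mirror image
of the fold rule.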
In my view PEP 495 solves a long-standing problem for which there is no adequate workaround within stdlib and third-party workarounds are cumbersome. The alternative datetime arithmetic PEP (PEP-0500) enables some nice to have features, but does not enable anything that cannot be achieved by other means. I would like to avoid mixing the two proposals. From guido at python.org Mon Aug 17 04:29:54 2015 From: guido at python.org (Guido van Rossum) Date: Sun, 16 Aug 2015 19:29:54 -0700 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: Message-ID: On Sun, Aug 16, 2015 at 2:16 PM, Tim Peters wrote: > [Guido] > > ... > > - Would there be any merit in proposing, together with the idea of a > > three-value flag, that datetime arithmetic should use "timeline > arithmetic" > > if the flag is defined and a tzinfo is present? > > It's worth considering. At the highest level, this requires: > > 1. Answering whether "timeline arithmetic" does or does not account > for gaps and folds due to leap seconds too. > > 2. Supplying a feasible path for those who insist the other answer is > the only acceptable one ;-) > > The way things have gone so far, the current PEP was meant to be a > "baby step" along the way, and PEP 500 goes on to refuse to even ask > #1, instead addressing #2 (tzinfo objects will grow ways to take over > datetime arithmetic, in any damn fool ;-) way they like). > > But if "timeline arithmetic" comes built in, #1 has to be answered up > front - and I wouldn't be surprised then if PEP 500 died, leaving the > #1 "losers" wholly devoid of hope. > > Which I take kinda seriously ;-) > > - It's nuts to add a minute to a UTC datetime and see the seconds > change ("leap seconds are insane"). > > - It's also nuts to subtract two UTC datetimes a minute apart and not > get the actual number of seconds between them in the real world ("leap > seconds are vital"). 
>
> The advantage of the current approach is that it leaves both camps
> equally empowered - and equally challenged - to scratch their own
> itches.
>
> That's the political answer.  As always, I'll leave the tech stuff to
> you eggheads ;-)
>

How did we end up bending over this far backwards for leap seconds?

To me, we're talking about a mapping to POSIX timestamps, which use a
straightforward algorithm to compute the date and time -- in
particular, divmod(ts, 86400) will give the day number and the second
within that day.  The day gets converted to a date using standard
calendar math (assuming they eventually fix the standard so that 2100
is not considered a leap year :-) and the time gets converted to
HH:MM:SS using even simpler calculations.  There's no room for leap
seconds there.

It's important to me that if two different Python implementations,
running on two different computers, convert a POSIX timestamp to a
date+time they get the same result.  This is *much* more important to
me than the idea that if two computers simultaneously call time.time()
they get the same value -- there is simply no such thing as
"simultaneously" (imagine one of the computers is on a rocket
traveling to the moon).

If you care about leap seconds you should use a different time source,
and you shouldn't be using either the time module or the datetime
module.  They are inextricably linked.

So there's my answer to #1.  You may consider this a Pronouncement if
you wish.  It should not come as any surprise.

And while I'm at it, I don't think PEP 500 is the answer.  If you
really want the number of real-world seconds between two datetime
values you can write your own difftime() function that consults a leap
seconds database.

As for how to request timeline arithmetic vs. the other kind ("human"?
I forget where our glossary is), that could be done by special-casing in the datetime class using some property (yet to be determined) of the tzinfo subclass or instance; or it could be done using different timedelta-ish classes. PEP 500 seems overly general just so it can also support the leap second case. ("So how do I write a real-time stock trading system", you may ask. Good question. Ask the stock exchanges. Their solution was not to trade near the leap second. Given that they probably have to deal with a mix of languages including at least Java, Cobol, Lisp, Python, and Smalltalk, I'm doubtful that they'll do better during the lifetime of Python 3. Famous last words perhaps.) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Mon Aug 17 04:37:53 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 16 Aug 2015 22:37:53 -0400 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement Message-ID: On Sun, Aug 16, 2015 at 10:29 PM, Guido van Rossum wrote: > How did we end up bending over this far backwards for leap seconds? That's why I think February 29, 1900 may be a better selling point for PEP-0500 than the 36 (and counting) leap seconds. From alexander.belopolsky at gmail.com Mon Aug 17 05:02:36 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sun, 16 Aug 2015 23:02:36 -0400 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: Message-ID: On Sun, Aug 16, 2015 at 10:29 PM, Guido van Rossum wrote: > To me, we're talking about a mapping to POSIX timestamps, which use a > straightforward algorithm to map compute the date and time -- in particular, > divmod(ts, 86400) will give the day number and the second within that day. 
> The day gets converted to a date using standard calendar math (assuming they
> eventually fix the standard so that 2100 is not considered a leap year :-)
> and the time gets converted to HH:MM:SS using even simpler calculations.
> There's no room for leap seconds there.

In PEP-0500, I don't talk about POSIX timestamps at all.  To me, POSIX
timestamps are just a compressed encoding for YYYY-MM-DD hh:mm:ss
names of various points in time.  The encoding is so efficient that
there is no room between 2015-06-30 23:59:59 and 2015-05-01 00:00:00.
Yes, it is very convenient that in most cases naive integer operations
on POSIX timestamps translate to real-world time operations, but this
is the same as the convenience of having 'a' + 1 produce 'b' in C.
Once you face the realities of Unicode, you have to give up this
convenience.

If PEP 495 is accepted, then unlike POSIX timestamps, datetime
instances will have enough redundancy to encode 2015-06-30 23:59:60
(as the second 2015-06-30 23:59:59) and not step over any otherwise
valid time.  Whether or not we should bless such abuse of the first
flag is a separate question, but the possibility will be there for
third-party libraries to explore.

From alexander.belopolsky at gmail.com Mon Aug 17 05:03:50 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sun, 16 Aug 2015 23:03:50 -0400
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To:
References:
Message-ID:

On Sun, Aug 16, 2015 at 11:02 PM, Alexander Belopolsky wrote:
> 2015-06-30 23:59:59 and 2015-05-01 00:00:00.

Should have been "2015-06-30 23:59:59 and 2015-07-01 00:00:00" of course.

From guido at python.org Mon Aug 17 05:18:26 2015
From: guido at python.org (Guido van Rossum)
Date: Sun, 16 Aug 2015 20:18:26 -0700
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To:
References:
Message-ID:

Thanks for the quick response!
On Sun, Aug 16, 2015 at 2:45 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > On Sun, Aug 16, 2015 at 3:23 PM, Guido van Rossum > wrote: > > I think that a courtesy message to python-dev is appropriate, with a > link to > > the PEP and an invitation to discuss its merits on datetime-sig. > > Will do. (Does anyone know how to set Reply-To: header in Gmail?) > I think you can set TO: datetime-sig, BCC: python-dev. > .. > > - I'm surprised the name of the proposed flag doesn't occur in the > abstract. > > > > That's because I wanted people to get to the proposal section before > starting to bikeshed on the name of the flag. More on that below. > Heh. :-) > > - The rationale might explicitly mention the two cases we're thinking > about: > > DST transitions and adjustments to the timezone's base offset -- noting > that > > the latter may be an arbitrary interval (not just an hour). > > > > Actually, in either case the adjustment can be a fraction of an hour. > I'll add this to the rationale. > > > - The sidebar doesn't show up as a sidebar, but as somewhat mysterious > text, > > on https://www.python.org/dev/peps/pep-0495/ (it does on > legacy.python.org, > > but we're trying to avoid that site). Maybe you should file a bug with > the > > pydotorg project on GitHub (if you haven't already). > > I did: . > > > (While I like the > > artwork, it's a bit un-PEP-like, and maybe not worth it given the > problems > > making the image appear properly.) > > If we don't fix the layout issues before the pronouncement, I'll > remove the graphic. > > > - Conversely, on legacy.python.org there are some error messages about > > "Unknown directive type "code"" (lines 112, 118). > > I'll look into this. I've never had problems with ReStructuredText > rendering on docs.p.o, but the peps site seems to be more restrictive. > FWIW I don't get errors when I do "make pep-0498.html" in the peps repo -- I consider that the ultimate arbiter of who's right. 
Maybe we have an old ReST version generating legacy? > > > > - "a fold is created in the fabric of time" sounds a bit like > > science-fiction. I'd just say "a time fold is created", or "a fold is > > created in time". > > > > Agree. After all, a "fold" already suggests some kind of fabric. > :-) Never thought about it this way. I've always just considered it an excessively literary phrase. Can't you fold a line though? > > - Despite having read the section about the naming, I'm still not wild > about > > the name 'first'. This is in part because this requires True as the > default, > > in part because without knowing the background its meaning somewhat > > mysterious. > > I agree. My top candidate is "repeated=False", but an invitation to > bikeshed, < > https://mail.python.org/pipermail/datetime-sig/2015-August/000241.html>, > was not met with the usual enthusiasm. Actually the *usual* enthusiasm is probably expressed by more bikeshedding. :-) In this case I have to agree that "repeated" doesn't sound right. > To defend the "True means > earlier" choice, I would mention that it matches "isdst=1 means > earlier" in the fold. > But nobody would be able to remember that mnemonic -- the far majority of people simply don't know whether to move the clock forward or back when DST begins or ends, they just read it in the newspaper the day before (or rely on their cell phone) and try to forget about it as soon as they can. At least, that's how I usually do it (even though I am well capable of reasoning it through from first principles, it's not wort remembering). > > I'm not wild about the alternatives either, so perhaps this > > requires more bikeshedding. :-( (FWIW I agree that the name should not > > reference DST, since time folds may appear for other reasons.) Hmm... > Maybe > > "fold=True" to select the second occurrance? 
> > I really want something that disambiguates two times based on their > most natural characteristics: do you want the earlier or the later of > the two choice? Anything else, in my view would require additional > knowledge. > Agreed. > - I'm a bit surprised that this flag doesn't have three values (e.g. None, > > True, False) -- in C, the tm_isdst flag in struct tm can be -1, 0 and 1, > > where -1 means "figure it out" or "don't care". > > With the proposed functionality, one can easily implement any of the > C-style isdst logic. Really? The way I interpret the PEP, there's no way to represent the "-1" case using a datetime alone. > The problem, however is that while most C > libraries agree with in their treatment of 0 and 1, the behavior on > tm_isdst=-1 ranges from bad to absurd. For example, the value > returned by mktime in the ambiguous case may depend on the arguments > passed to the previous call to mktime. > The actual behavior and bugs of C libraries don't interest me much. I just care about "None" meaning "nobody set it, probably because the code was written before this flag was introduced". > > The "don't care" case should allow stricter backward compatibility. > > I am not sure we want to maintain the behavior described in > (Calling timestamp() on a datetime > object modifies the timestamp of a different datetime object.) > I can't quite follow the bug. Does it imply that datetime objects are mutable? Or is there some global state that's set by the timestamp() function? What is it that ts1.timestamp() changes that affects ts2.timestamp()? Anyway, I'm not saying we should maintain backwards compatibility in that case (assuming you can convince me it's a bug that should be fixed regardless of whether we accept this PEP). > - "[1] An instance that has first=False in a non-ambiguous case is said to > > represent an invalid time ..." 
Could you quickly elaborate here whether > such > > an invalid time is considered an hour later than the valid corresponding > > time with first=True, given a reasonable timezone with and without DST? > > Such an instance is just *invalid* as in "February 29, 2015." In a > non-ambiguous case, first=False means "the second of one", which does > not make sense. Such instances should never be produced except for a > narrow purpose of probing the astimezone() or timestamp() to determine > whether a given datetime is ambiguous or not. > Yeah, but it can still be created -- and if I have one, how does it behave? (I don't care what it means. :-) > - "In CPython, a non-boolean value of first will raise a TypeError , but > > other implementations may allow the value None to behave the same as when > > first is not given." This is surprisingly lenient. Why allow the second > > behavior at all? > > Because it is currently allowed for the other arguments of replace() > in the pure python datetime implementation that we ship. I will be > happy to change that starting with the "first". > OK, seems to make sense to be consistent with the other args -- just explain that reason in the text then. > > (Surely all Python implementations can distinguish between > > a value equal to None and a missing value, even if some kind of hack is > > needed.) Also, why this clause for replace() but not for other methods? > > What other methods? replace() is fairly unique in its treatment of > arguments. > Well, several other methods also have a first=... argument. How should they treat first=None compared to its absence? > - I'm disappointed that there are now some APIs that explicitly treat a > > naive datetime as local (using the system timezone). I carefully avoided > > such interpretation in the original design, since a naive datetime can > also > > be used to represent a point in UTC, or in some timezone that's implicit. 
> > But I guess this cat is out of the bag since it's already assumed by > > timestamp() and fromtimestamp(). :-( > > I held that siege as long as I could. > And thanks for that! I guess we move on now. > - "Conversion from POSIX seconds from EPOCH" I'd move this section before > > the opposite clause, since it is simpler and the other clause references > > fromtimestamp(). The behavior of fromtimestamp() may also be considered > > motivational for having only the values True and False for the flag. > > > > Will do. > > > - "New guidelines will be published for implementing concrete timezones > with > > variable UTC offset." Where? > > In the official datetime documentation. I'll clarify that. > > > (Is this just a forward reference to the next section? Then I'd drop it.) > > No, I expect that section to be incorporated in the official datetime > library documentation. > OK, should definitely be clarified. Note that PEPs rarely say anything about the docs -- the docs simply follow the specs laid out by the PEP. So the PEP could just state the guidelines. (After all the guidelines can always be viewed in the context of the PEP.) > - "... must follow these guidelines." Here "must" is very strong (it is > the > > strongest word in "standards speak", stronger than "should", "ought to", > > "may"). I recommend "should", that's strong enough. > > OK. This is a remnant of the idea to include a first-aware fromutc() > implementation, which after some private discussions with Tim we > decided to abandon. In light of that idea, "must" made sense as in > "in order for unmodified fromutc() work correctly with your tzinfo > implementation, it *must* ..." > > .. > > - "We chose the minute byte to store the the "first" bit because this > choice > > preserves the natural ordering." This only works with folds of exactly > one > > hour. Also, is the natural ordering (of the pickles, apparently) used > > anywhere? I would hope not. 
> > Finally, given that two times that differ only
> > in their 'first' flag compare equal, the natural ordering (if relevant :-)
> > would be to store/compare the 'first' bit last.
>
> I'll remove the rationale.  The ordering is a red herring anyway.  I
> needed a place to stick one bit in the 10-byte payload and the minute
> byte looked like a natural place.  I made up the ordering rationale to
> justify an arbitrary choice a posteriori.
>
> > - Temporal Arithmetic (probably shouldn't have an "s" at the end):
>
> Wikipedia is of no help here: "Arithmetic or arithmetics (from the
> Greek ἀριθμός arithmos, "number") ..."  I'll check what we use in the
> library docs. (For some reason, I thought that arithmetic is a branch
> of mathematics while arithmetics is a set of rules.)

Maybe it's British vs. American usage? The Brits also say "maths" while
Americans say "math". But I don't think I've ever seen or heard arithmetics
with an 's', and I've seen and heard plenty of 'maths'. Anyway, we tend to
use American spelling in PEPs.

> > this probably needs some motivation. I think it's inevitable (since we don't know
> > the size of the time fold), but it still feels weird.
>
> It's what you say and backward compatibility considerations.  We want
> existing programs to produce the same results even if they
> occasionally encounter first=False instances from, say, datetime.now().
> I'll add a footnote.
>
> > - "[2] As of Python 3.5, tzinfo is ignored whenever timedelta is added or
> > subtracted ..." I don't see a reason for this footnote to discuss possible
> > future changes to datetime arithmetic; leave that up to the respective PEP.
>
> I'll remove the discussion of the future changes to datetime arithmetic.
>
> > (OTOH you may have a specific follow-up PEP in mind, and it may be better to
> > review this one in the light of the follow-up PEP.)
>
> Yes, there is a PEP-0500, but it is nowhere near as ready as this one.
> I think it's overkill (see my previous message). > > - "This proposal will have little effect on the programs that do not read > > the first flag explicitly or use tzinfo implementations that do." This > seems > > ambiguous -- if I use a tzinfo implementation that reads the first flag, > am > > I affected or not? Also, "the programs" should be just "programs", and > I'm > > kind of curious why the hedging of "little effect" (rather than "no > effect") > > We are changing the behavior of datetime.timestamp on naive instances. > This is really what the "hedging" is about. > OK. Might be good to be explicit in the text. But you haven't responded to my complaint that the "or" clause you used is ambiguous in English -- which of the following does it mean? - (not "read first flag" or "use tzinfo impls that do") - not ("read first flag" or "use tzinfo impls that do") > > is needed. Also, you might give some examples of changes that programs > that > > *do* use the first flag may experience. > > I don't understand. The programs that *do* use the first flag now > experience an AttributeError, and that will surely change. Perhaps > you want to see some examples of how the programs can start using the > first flag? > I was thinking about what happens in a program that explicitly uses the flag to create a datetime object and then passes it on to some library code that doesn't know about the flag. But the change in behavior of fromtimestamp() makes this a moot point. Better be explicit about the hedging. > > > > - In a reply to this thread, you wrote "The rule for the missing time is > the > > opposite to that for the ambiguous time. This allows a program that > probes > > the TZ database by calling timestamp with two different values of the > > "first" flag to avoid any additional calls to differentiate between the > gap > > and the fold." 
Can you clarify this (I'm not sure how this works, though > I > > intuitively agree that the two rules should be each other's opposite) and > > add it to the PEP? > > > > Yes, I posted something like this before, but will include in the PEP. > A first-aware program can do something like the following when it gets > a naive instance dt that it wants to decorated with a timezone. > > dt1 = dt.replace(first=True).astimezone() > dt2 = dt.replace(first=False).astimezone() > > if dt1 == dt2: > return dt1 > > if dt1 < dt2: > warn("ambiguous time: picked %s but it could be %s", dt1, dt2) > return dt1 > > if dt1 > dt2: > raise ValueError("invalid time", dt, dt1, dt2) > > > > - Would there be any merit in proposing, together with the idea of a > > three-value flag, that datetime arithmetic should use "timeline > arithmetic" > > if the flag is defined and a tzinfo is present? > > To add a third value, you will need a full additional bit anyways, so > why not just have a separate flag that controls the choice of > arithmetic and leave "first" a pure fold disambiguation flag? I > consider the problem of local time disambiguation and that of the > "timeline arithmetic" to be two orthogonal problems. Yes, "timeline > arithmetic" can benefit from the first flag, but it is possible > without it. Similarly, the problem of round-tripping the times > between timezones can benefit from "timeline arithmetic", but PEP 495 > solves it without introducing the new arithmetic. > Fair enough. The PEP could use some discussion of this topic! > In my view PEP 495 solves a long-standing problem for which there is > no adequate workaround within stdlib and third-party workarounds are > cumbersome. The alternative datetime arithmetic PEP (PEP-0500) > enables some nice to have features, but does not enable anything that > cannot be achieved by other means. I would like to avoid mixing the > two proposals. 
> Agreed, and I am very glad to see PEP 495, as a concrete proposal for the "first step" that I proposed a while ago. Still, the whole term "first step" implies there will be more steps, and we should make sure the first step is roughly in the right direction! -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Aug 17 05:19:55 2015 From: guido at python.org (Guido van Rossum) Date: Sun, 16 Aug 2015 20:19:55 -0700 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: Message-ID: On Sun, Aug 16, 2015 at 7:37 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > On Sun, Aug 16, 2015 at 10:29 PM, Guido van Rossum > wrote: > > How did we end up bending over this far backwards for leap seconds? > > That's why I think February 29, 1900 may be a better selling point for > PEP-0500 than the 36 (and counting) leap seconds. > I'm surely missing a joke here. Is this just the problem with the POSIX standard disagreeing with Pope Gregory XIII? -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Mon Aug 17 05:25:38 2015 From: guido at python.org (Guido van Rossum) Date: Sun, 16 Aug 2015 20:25:38 -0700 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: Message-ID: On Sun, Aug 16, 2015 at 8:02 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > On Sun, Aug 16, 2015 at 10:29 PM, Guido van Rossum > wrote: > > To me, we're talking about a mapping to POSIX timestamps, which use a > > straightforward algorithm to map compute the date and time -- in > particular, > > divmod(ts, 86400) will give the day number and the second within that > day. 
> > The day gets converted to a date using standard calendar math (assuming > they > > eventually fix the standard so that 2100 is not considered a leap year > :-) > > and the time gets converted to HH:MM:SS using even simpler calculations. > > There's no room for leap seconds there. > > In PEP-0500, I don't talk about POSIX timestamps at all. To me, POSIX > timestamps are just a compressed encoding for YYYY-MM-DD hh-mm-ss > names of various points in time. The encoding is so efficient that > there is no room between 2015-06-30 23:59:59 and 2015-05-01 00:00:00. > Yes, it is very convenient that in most cases naive integer operations > on POSIX timestamps translate to real world time time operations, but > this is the same as the convenience of having 'a' + 1 produce 'b' in > C. Once you face the realities of Unicode, you have to give up this > convenience. > > If PEP 495 is accepted, then unlike POSIX timestamps, datetime > instances will have enough redundancy to encode 2015-06-30 23:59:60 > (as the second 2015-06-30 23:59:59) and not step over any otherwise > valid time. Whether or not we should bless such abuse of the first > flag, is a separate question, but the possibility will be there for > the third party libraries to explore. > Good point, but I do have an opinion here, which is that we should not bless such abuse. There are other interpretations and uses of POSIX timestamps, and certainly the time module promises that (except when the clock is adjusted by root) time.time() ticks monotonically and measures (approximate, but close) real seconds. This is the reason for Google's "smearing" of leap seconds -- interval timers and the like based on the system clock continue to work. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tim.peters at gmail.com Mon Aug 17 06:10:13 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 16 Aug 2015 23:10:13 -0500 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: Message-ID: [Guido] > How did we end up bending over this far backwards for leap seconds? Eh - I don't see that we are. The `first` flag is applicable to any source of folds and gaps, provided the folds aren't worse than 2-to-1. Leap seconds are just one more case of that. If you want to build in gap-and-fold-aware arithmetic, then it seems only one kind can be supported. If it's left to tzinfo implementers, then it's up to them. > To me, we're talking about a mapping to POSIX timestamps, which use a > straightforward algorithm to map compute the date and time -- in particular, > divmod(ts, 86400) will give the day number and the second within that day. > The day gets converted to a date using standard calendar math (assuming they > eventually fix the standard so that 2100 is not considered a leap year :-) > and the time gets converted to HH:MM:SS using even simpler calculations. > There's no room for leap seconds there. Nobody is proposing to change any of that. A POSIX timestamp maps to the same UTC time in any case (including in cases where a POSIX timestamp is ambiguous, and including cases that produce a UTC time that never existed (hasn't happened yet, but will when a leap second gets removed someday)). None of that _precludes_ implementing an arithmetic that returns correct real-life deltas (between real-life UTC times) as SI-second durations. That can be done in most (but not all) cases using POSIX timestamps by consulting a table of leap second adjustments. It _could_ be done in all cases (not just most) using first-aware datetimes, because - unlike bare POSIX timestamps - the `first` flag gives datetimes a calendar notation that disambiguates the ambiguous POSIX timestamps. 
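A rough editorial sketch of the two ideas in the paragraphs above, not code from the thread: the plain divmod-based mapping from a POSIX timestamp to a UTC calendar time, and an SI-second difference obtained by consulting a leap second table. The function names and the `LEAP_TABLE` list are hypothetical, and the table is deliberately partial (it only knows the 2012-06-30 and 2015-06-30 insertions), so treat it as an illustration under those assumptions.

```python
from datetime import date, datetime, timezone

# Assumption: a partial, illustrative table. Each entry is the POSIX
# timestamp of the first second *after* an inserted leap second.
LEAP_TABLE = [
    int(datetime(2012, 7, 1, tzinfo=timezone.utc).timestamp()),
    int(datetime(2015, 7, 1, tzinfo=timezone.utc).timestamp()),
]


def posix_to_utc(ts):
    """Map an integer POSIX timestamp to (date, hh, mm, ss); no leap seconds."""
    days, secs = divmod(ts, 86400)
    hh, rest = divmod(secs, 3600)
    mm, ss = divmod(rest, 60)
    return date.fromordinal(date(1970, 1, 1).toordinal() + days), hh, mm, ss


def leaps_up_to(ts):
    """Count the table entries at or before ts."""
    return sum(1 for t in LEAP_TABLE if t <= ts)


def si_seconds_between(ts_start, ts_end):
    """POSIX difference corrected by the leap seconds inserted in between."""
    return (ts_end - ts_start) + leaps_up_to(ts_end) - leaps_up_to(ts_start)


before = LEAP_TABLE[1] - 1   # 2015-06-30 23:59:59 UTC
after = LEAP_TABLE[1]        # 2015-07-01 00:00:00 UTC
print(posix_to_utc(before), posix_to_utc(after))
# The POSIX difference is 1, while the SI-second difference is 2.
print(after - before, si_seconds_between(before, after))
```

The sketch cannot name the leap second itself (23:59:60), since a bare POSIX timestamp has no distinct value for it; that is the "most (but not all) cases" limitation mentioned above.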
Do note that POSIX supports a calendar notation for leap seconds too (allowing tm_sec to be 60). _Given_ a `first`-like flag, nothing beyond that is really required from Python (although it would be more useful if a way were given to map a second value of "60" to/from first=False when applicable). BTW, POSIX fixed the "2100 is not a leap year" problem in 2001 - but ancient Internet rants never die ;-) > It's important to me that if two different Python implementations, running > on two different computers, convert a POSIX timestamp to a date+time they > get the same result. That's important to everyone. It would remain true even if someone did write an implementation of leap-second-aware arithmetic. > This is *much* more important to me than the idea that if two computers > simultaneously call time.time() they get the same value -- there is simply > no such thing as "simultaneously" (imagine one of the computers is on > a rocket traveling to the moon). One factoid I learned recently: the best atomic clocks now are so bloody sensitive that they run at detectably different rates if their altitude changes by an inch (due to gravitational time dilation). Luckily, nobody yet has demanded Python support relativistic datetime conversions ;-) > If you care about leap seconds you should use a different time source, and > you shouldn't be using either the time module or the datetime module. They > are inextricably linked. Eh - it's a shallow problem. Just tedious. Adding `first` is a crucial part of the battle for _all_ kinds of gap-and-fold-aware arithmetic. And the code for all kinds of the latter is a tedious but conceptually trivial chore. Accounting for clocks jumping around is no harder "theoretically" when the jump is caused by a leap second than when it's caused by daylight time starting or ending. > So there's my answer to #1. You may consider this a Pronouncement if you > wish. It should not come as any surprise. You don't want leap-aware arithmetic in the core. 
Neither do I.  The remaining question is whether you want to make it
impossible for someone else to add it.

> And while I'm at it, I don't think PEP 500 is the answer. If you really want
> the number of real-world seconds between two datetime values you can write
> your own difftime() function that consults a leap seconds database.

That's a start.  Related questions include "the nuclear reactor has to
be vented exactly 3600 SI seconds from now - what will the local clock
say then?".

I agree such apps "should be" using TAI.  But the one actual scientist
who has posted here most often says they don't get a choice about the
time system used by the data they need to analyze.  I say they should
convert all the data they're given to TAI.  They don't want to hear
that ;-)

> As for how to request timeline arithmetic vs. the other kind ("human"? I
> forget where our glossary is), that could be done by special-casing in the
> datetime class using some property (yet to be determined) of the tzinfo
> subclass or instance; or it could be done using different timedelta-ish
> classes. PEP 500 seems overly general just so it can also support the leap
> second case.

I've mentioned this before:  I think it's insane to try to implement
"human arithmetic" by overloading arithmetic operators.  See the
iCalendar spec for the least people expect now.  Things like "the
first Tuesday after the first Monday in November every 4 years" (US
presidential election dates) are just the start.  Trying to spell all
that with combinations of +-*/% would be a write-only nightmare.
dateutil implements the full iCalendar RRULE spec, and sanely uses
functions with lots of keyword arguments.  Nobody is _really_ going to
improve on that.

So I don't see PEP 500 as aiming at human arithmetic at all (although
others may).  I do see it as a way to:

1. Keep _all_ gap-and-fold timeline arithmetics out of the core.  I
   don't really want, e.g., DST-aware timeline arithmetic in the core
   either.

2. Let tzinfo implementers decide which kinds of gaps and folds they
   care about.

> ("So how do I write a real-time stock trading system", you may ask. Good
> question. Ask the stock exchanges. Their solution was not to trade near the
> leap second. Given that they probably have to deal with a mix of languages
> including at least Java, Cobol, Lisp, Python, and Smalltalk, I'm doubtful
> that they'll do better during the lifetime of Python 3. Famous last words
> perhaps.)

Most shut down for at least an hour around the last leap second,
because a leap second gets inserted just as the hour (on clocks
running at a multiple-of-hour offset from UTC) is about to change.
Experience taught them last time around that software confusions
quickly propagate across fields :-(

But that's a different problem.  Python isn't the _source_ of anyone's
notion of time.  CPython inherits all it can know about "what time is
it now?" from the OS and platform C libraries.

From lrekucki at gmail.com  Mon Aug 17 12:23:30 2015
From: lrekucki at gmail.com (Łukasz Rekucki)
Date: Mon, 17 Aug 2015 12:23:30 +0200
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To:
References:
Message-ID:

On Sunday, August 16, 2015, Alexander Belopolsky <
alexander.belopolsky at gmail.com> wrote:

> PEP 495 [1] is a deliberately minimalistic proposal to remove an
> ambiguity in representing some local times as datetime.datetime
> objects.

I'm quite confused - I thought that all this discussion about terminology
ended with agreement that the current datetime object is a good
representation of naive time (aka LocalDateTime in other libraries). That
time is never ambiguous because it has no timezone to begin with.

I'm also not sure what purpose adding this flag to the time object has. I
know it has a tzinfo, but it's obviously impossible to have a DST-aware
implementation working with that.
> > The concept has been extensively discussed on python-ideas and this > mailing list. I believe a consensus has been reached and it is > reflected in the PEP. A middle of summer vacations is not the best time to reach a consensus on anything ;) I'm not asking for special treatment, but as Python 3.6 is months away, maybe this could wait a week or two more. > > PEP 495 does not propose any changes to datetime/timedelta > arithmetics, but it has been agreed that it is a necessary step for > implementing the "strict" rules. Without a clear view what are the next steps with this approach, I think it might harm the final solution because it introduces even more things to be backward compatible about. ---- I would like to propose an alternate solution. Instead of a flag, I propose to add a "zone offset" property. This would be a read-only property that is assigned by tzinfo when converting to that timezone (and possibly when doing arithmetic). This solves the problem of overlapping moments, doesn't depend on the gap being 60 minutes or having external knowledge about it and allows for simple convention to UTC. Instead of adding more arguments to constructor and replace(), I propose add two methods to datetime: * earlier_when_overlap() * later_when_overlap() That would delegate to tzinfo the task to produce a datetime instance that matches the requested moment in time. When parsing strings, earlier would be always chosen and user can adjust to their preference. For all current implementations those would be a no-ops. I hope this actually makes any sense to you. I don't have such great writing skills to produce a PEP so quickly (especially on a phone). 
If not, just ignore this message ;)

Best regards,
Łukasz

> [1]: https://www.python.org/dev/peps/pep-0495

--
Łukasz Rekucki

From alexander.belopolsky at gmail.com  Mon Aug 17 17:50:22 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 17 Aug 2015 11:50:22 -0400
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To:
References:
Message-ID:

On Sun, Aug 16, 2015 at 11:19 PM, Guido van Rossum wrote:
> On Sun, Aug 16, 2015 at 7:37 PM, Alexander Belopolsky wrote:
>>
>> On Sun, Aug 16, 2015 at 10:29 PM, Guido van Rossum wrote:
>> > How did we end up bending over this far backwards for leap seconds?
>>
>> That's why I think February 29, 1900 may be a better selling point for
>> PEP-0500 than the 36 (and counting) leap seconds.
>
> I'm surely missing a joke here. Is this just the problem with the POSIX
> standard disagreeing with Pope Gregory XIII?

No, the problem is that both POSIX and Pope Gregory XIII disagree with
Microsoft.

https://support.microsoft.com/en-us/kb/214326

From chris.barker at noaa.gov  Mon Aug 17 18:44:50 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Mon, 17 Aug 2015 09:44:50 -0700
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To:
References:
Message-ID:

Alexander,

Great stuff -- thanks for taking the time to write all this up so
clearly -- for what really is a tiny change! (And very good that that is
the only thing this PEP is covering.)

My one comment:

It seems that a "missing" time really should be an Error -- like Feb 29th
in a non-leap year.
However, in the PEP:

"""
When a datetime.datetime instance dt represents a missing time, there is no
value s for which:

    datetime.fromtimestamp(s) == dt

but we can form two "nice to know" values of s that differ by the size of
the gap in seconds. One is the value of s that would correspond to dt in a
timezone where the UTC offset is always the same as the offset right before
the gap and the other is the similar value but in a timezone the UTC offset
is always the same as the offset right after the gap.

The value returned by dt.timestamp() given a missing dt will be the larger
of the two "nice to know" values if dt.first == True and the smaller
otherwise.
"""

I _think_ I recall from this discussion that this was done, rather than
raising an error, because it was decided that such calls should never raise
an exception (also, the replace() functionality makes it all too likely).
So it's probably best to do it in the proposed way, but the PEP should make
it clear that a prime motivation is to avoid exceptions in such code.

-Chris

On Sat, Aug 15, 2015 at 5:49 PM, Alexander Belopolsky <
alexander.belopolsky at gmail.com> wrote:

> PEP 495 [1] is a deliberately minimalistic proposal to remove an
> ambiguity in representing some local times as datetime.datetime
> objects.
>
> The concept has been extensively discussed on python-ideas and this
> mailing list.  I believe a consensus has been reached and it is
> reflected in the PEP.
>
> PEP 495 does not propose any changes to datetime/timedelta
> arithmetics, but it has been agreed that it is a necessary step for
> implementing the "strict" rules.
>
> [1]: https://www.python.org/dev/peps/pep-0495

--
Christopher Barker, Ph.D.
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Aug 17 18:58:30 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 17 Aug 2015 09:58:30 -0700 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: Message-ID: On Sun, Aug 16, 2015 at 7:29 PM, Guido van Rossum wrote: > > How did we end up bending over this far backwards for leap seconds? > Are we bending over backwards at all for leap seconds? I don't see where -- at least not in this PEP. But as the un-official leap-seconds nudge on this list -- I don't think we should bend over backwards to support them. I just think if we can leave open the door to a future implementation without any bending, then we should. > To me, we're talking about a mapping to POSIX timestamps > indeed -- and if you are mapping to POSIX timestamps, it should certainly match what POSIX specifies. > If you care about leap seconds you should use a different time source, and > you shouldn't be using either the time module or the datetime module. > well, not as datetime is currently implemented, anyway, sure. > > > And while I'm at it, I don't think PEP 500 is the answer. > I know I really don't like the idea of delegating everything to the tzinfo object, it simply doesn't seem to be the right place for things other than, timezone info / operations. If you really want the number of real-world seconds between two datetime > values you can write your own difftime() function that consults a leap > seconds database. > > If I get around to it, I'd like to try a datetime subclass (or duck-typed work-alike) that does leap seconds (with a table, of course). 
I think it should be doable such that it can interoperate with the existing timedelta classes and tzinfo objects. Maybe I'm totally wrong, but I guess I'll find out if/when I get around to it.

> As for how to request timeline arithmetic vs. the other kind ("human"? I
> forget where our glossary is), that could be done by special-casing in the
> datetime class using some property (yet to be determined) of the tzinfo
> subclass or instance; or it could be done using different timedelta-ish
> classes. PEP 500 seems overly general just so it can also support the leap
> second case.

I totally agree -- datetime is where the "how to handle tzinfo for computing deltas" logic is, so it makes sense to keep it there. A flag for what kind of arithmetic you want should be able to handle it. But that's another PEP.

-Chris

From chris.barker at noaa.gov Mon Aug 17 19:12:48 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Mon, 17 Aug 2015 10:12:48 -0700
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To: References: Message-ID:

On Sun, Aug 16, 2015 at 8:18 PM, Guido van Rossum wrote:

> But nobody would be able to remember that mnemonic -- the far majority of
> people simply don't know whether to move the clock forward or back when DST
> begins or ends

Somehow I still remember the "spring forward : fall back" mnemonic I learned in grad school -- but I don't think that helps us here....

> - I'm disappointed that there are now some APIs that explicitly treat a
> naive datetime as local (using the system timezone).

Indeed -- but as long as it's isolated to this one corner of the API, I guess we're OK.
BTW, numpy's datetime assumes the locale timezone when parsing ISO 8601 strings without an offset -- and it's a disaster. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Mon Aug 17 19:15:08 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 17 Aug 2015 13:15:08 -0400 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: Message-ID: On Mon, Aug 17, 2015 at 12:44 PM, Chris Barker wrote: > I _think_ I recall form this discussion that this was done, rather than an > error, because it was decided that such calls should never raise an > exception (also, the replace() functionality makes it all too likely). So > it's probably best to do it in the proposed way, but the PEP should make it > clear that a prime motivation is to avoid Exceptions in such code. There are many reasons why the proposed behavior is a good one, but raising an exception from .timestamp() is not really a contender. We cannot make currently valid programs crash and there are many ways a missing time can be produced by a valid program even without an explicit .replace() call. For example, adding timedelta(1) to a valid datetime can produce a missing datetime. It is also common in some algorithms to reattach tzinfo without checking the other datetime components. For example, stdlib's own datetime.tzinfo.fromutc() does that. I am going to add an explanation of how to implement code that rejects missing values to the PEP and this will probably address your concern. (The code is just if dt.replace(first=True).timestamp() > dt.replace(first=False).timestamp(): raise MissingTimeError.) 
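
A minimal sketch of the missing-time check from the message above, under some loudly stated assumptions. The thread's draft flag `first` is spelled here with the `fold` attribute that the final PEP 495 added in Python 3.6 (`first=True` corresponds to `fold=0`). `ToyEastern2015` is a hypothetical tzinfo with US Eastern rules hard-coded for 2015 only, and `MissingTimeError` is just the name used in the message, not a stdlib exception.

```python
from datetime import datetime, timedelta, tzinfo

HOUR = timedelta(hours=1)


class ToyEastern2015(tzinfo):
    """Hypothetical tzinfo: US Eastern rules hard-coded for 2015 only."""

    STD, DST = timedelta(hours=-5), timedelta(hours=-4)
    GAP = datetime(2015, 3, 8, 2)    # 02:00-03:00 local wall time is skipped
    FOLD = datetime(2015, 11, 1, 1)  # 01:00-02:00 local wall time occurs twice

    def utcoffset(self, dt):
        wall = dt.replace(tzinfo=None)
        if wall < self.GAP:
            return self.STD
        if wall < self.GAP + HOUR:
            # Missing time: fold=0 uses the offset in effect before the gap.
            return self.DST if dt.fold else self.STD
        if wall < self.FOLD:
            return self.DST
        if wall < self.FOLD + HOUR:
            # Ambiguous time: fold=0 is the earlier (DST) reading.
            return self.STD if dt.fold else self.DST
        return self.STD

    def dst(self, dt):
        return self.utcoffset(dt) - self.STD

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


class MissingTimeError(ValueError):
    """Name taken from the message above; not part of the stdlib."""


def reject_missing(dt):
    """Raise if the aware datetime dt falls in a gap, else return it."""
    if dt.replace(fold=0).timestamp() > dt.replace(fold=1).timestamp():
        raise MissingTimeError(dt)
    return dt
```

With this toy zone, `reject_missing(datetime(2015, 3, 8, 2, 30, tzinfo=ToyEastern2015()))` raises, while an ambiguous time such as 01:30 on 2015-11-01 passes through, because there the inequality points the other way.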
From chris.barker at noaa.gov Mon Aug 17 19:19:47 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 17 Aug 2015 10:19:47 -0700 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: Message-ID: On Mon, Aug 17, 2015 at 10:15 AM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > but > raising an exception from .timestamp() is not really a contender. exactly -- good to make that clear in the PEP. > I am going to add an explanation of how to implement code that rejects > missing values to the PEP and this will probably address your concern. > > thanks -- perfect. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Mon Aug 17 19:20:57 2015 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 17 Aug 2015 10:20:57 -0700 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: Message-ID: <55D217F9.3020108@stoneleaf.us> On 08/17/2015 09:58 AM, Chris Barker wrote: > I know I really don't like the idea of delegating everything to the > tzinfo object, it simply doesn't seem to be the right place for things > other than, timezone info / operations. The way I look at it is the final object is a merging of a datetime and tzinfo. The actual logic of how to add/subtract, etc., is built in to the datetime itself and the switch is built into the tzinfo -- so classic tzinfo (aka no switch present) gets the current behavior, while the new "strict" tzinfo (with switch present) would instigate the other behavior. 
Off the cuff I would say have the datetime object check for certain methods in the tzinfo (such as add_datetime, add_timedelta, etc), and if present use them, otherwise use the normal methods. If performance becomes an issue we could have a different base class for strict datetimes, and the act of adding a tzinfo can change the class of the resulting datetime object to that strict subclass. -- ~Ethan~ From carl at oddbird.net Mon Aug 17 19:33:45 2015 From: carl at oddbird.net (Carl Meyer) Date: Mon, 17 Aug 2015 11:33:45 -0600 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: Message-ID: <55D21AF9.8060102@oddbird.net> On 08/17/2015 10:58 AM, Chris Barker wrote: > On Sun, Aug 16, 2015 at 7:29 PM, Guido van Rossum And while I'm at it, I don't think PEP 500 is the answer. > > > I know I really don't like the idea of delegating everything to the > tzinfo object, it simply doesn't seem to be the right place for things > other than, timezone info / operations. Datetime arithmetic with a timezone-aware datetime _is_ a "timezone operation". Doing it correctly requires knowledge of timezone transitions. I think PEP 500 is an elegant and flexible solution, and the alternatives discussed so far (e.g. hardcoding isinstance checks) are much less flexible for users of the datetime module, without compensating benefit. I haven't seen anyone yet present a specific downside to the PEP 500 approach. But that's all off-topic for this thread, which is about PEP 495, not PEP 500. Carl -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From alexander.belopolsky at gmail.com Mon Aug 17 19:55:18 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 17 Aug 2015 13:55:18 -0400 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: <55D217F9.3020108@stoneleaf.us> References: <55D217F9.3020108@stoneleaf.us> Message-ID: On Mon, Aug 17, 2015 at 1:20 PM, Ethan Furman wrote: > Off the cuff I would say have the datetime object check for certain methods > in the tzinfo (such as add_datetime, add_timedelta, etc), and if present use > them, otherwise use the normal methods. Isn't this exactly what PEP-0500 says? From ethan at stoneleaf.us Mon Aug 17 20:05:55 2015 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 17 Aug 2015 11:05:55 -0700 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: <55D217F9.3020108@stoneleaf.us> Message-ID: <55D22283.9060406@stoneleaf.us> On 08/17/2015 10:55 AM, Alexander Belopolsky wrote: > On Mon, Aug 17, 2015 at 1:20 PM, Ethan Furman wrote: >> Off the cuff I would say have the datetime object check for certain methods >> in the tzinfo (such as add_datetime, add_timedelta, etc), and if present use >> them, otherwise use the normal methods. > > Isn't this exactly what PEP-0500 says? Is it? I haven't read it yet . If so, I'm probably in favor! ;) -- ~Ethan~ From chris.barker at noaa.gov Mon Aug 17 20:15:23 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 17 Aug 2015 11:15:23 -0700 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: <55D22283.9060406@stoneleaf.us> References: <55D217F9.3020108@stoneleaf.us> <55D22283.9060406@stoneleaf.us> Message-ID: Could we move PEP 500 discussion to its own thread? Or re-use the one started when Alexander posted the PEP. 
-Chris On Mon, Aug 17, 2015 at 11:05 AM, Ethan Furman wrote: > On 08/17/2015 10:55 AM, Alexander Belopolsky wrote: > >> On Mon, Aug 17, 2015 at 1:20 PM, Ethan Furman wrote: >> > > Off the cuff I would say have the datetime object check for certain methods >>> in the tzinfo (such as add_datetime, add_timedelta, etc), and if present >>> use >>> them, otherwise use the normal methods. >>> >> >> Isn't this exactly what PEP-0500 says? >> > > Is it? I haven't read it yet . > > If so, I'm probably in favor! ;) > > -- > ~Ethan~ > > _______________________________________________ > Datetime-SIG mailing list > Datetime-SIG at python.org > https://mail.python.org/mailman/listinfo/datetime-sig > The PSF Code of Conduct applies to this mailing list: > https://www.python.org/psf/codeofconduct/ > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Mon Aug 17 20:51:54 2015 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 17 Aug 2015 11:51:54 -0700 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: <55D217F9.3020108@stoneleaf.us> <55D22283.9060406@stoneleaf.us> Message-ID: <55D22D4A.2000304@stoneleaf.us> On 08/17/2015 11:15 AM, Chris Barker wrote: > Could we move PEP 500 discussion to its own thread? > > Or re-use the one started when Alexander posted the PEP. Certainly, and thanks for the reminder. My original point was, however, to answer your objection about having the logic in the tzinfo. 
-- ~Ethan~ From guido at python.org Mon Aug 17 21:05:22 2015 From: guido at python.org (Guido van Rossum) Date: Mon, 17 Aug 2015 12:05:22 -0700 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: Message-ID: On Sun, Aug 16, 2015 at 9:10 PM, Tim Peters wrote: > [Guido] > > How did we end up bending over this far backwards for leap seconds? > > Eh - I don't see that we are. The `first` flag is applicable to any > source of folds and gaps, provided the folds aren't worse than 2-to-1. > Leap seconds are just one more case of that. > That was in reference to PEP 500 more than to the `first` flag. > If you want to build in gap-and-fold-aware arithmetic, then it seems > only one kind can be supported. If it's left to tzinfo implementers, > then it's up to them. > Well, most tzinfo implementers have other priorities; leap-secondists, while loud, are rare enough that the extra freedom just complexificates the API for everyone else without reason. > > To me, we're talking about a mapping to POSIX timestamps, which use a > > straightforward algorithm to map compute the date and time -- in > particular, > > divmod(ts, 86400) will give the day number and the second within that > day. > > The day gets converted to a date using standard calendar math (assuming > they > > eventually fix the standard so that 2100 is not considered a leap year > :-) > > and the time gets converted to HH:MM:SS using even simpler calculations. > > There's no room for leap seconds there. > > Nobody is proposing to change any of that. A POSIX timestamp maps to > the same UTC time in any case (including in cases where a POSIX > timestamp is ambiguous, and including cases that produce a UTC time > that never existed (hasn't happened yet, but will when a leap second > gets removed someday)). > > None of that _precludes_ implementing an arithmetic that returns > correct real-life deltas (between real-life UTC times) as SI-second > durations. 
That can be done in most (but not all) cases using POSIX > timestamps by consulting a table of leap second adjustments. It > _could_ be done in all cases (not just most) using first-aware > datetimes, because - unlike bare POSIX timestamps - the `first` flag > gives datetimes a calendar notation that disambiguates the ambiguous > POSIX timestamps. Do note that POSIX supports a calendar notation for > leap seconds too (allowing tm_sec to be 60). > But datetime does not. IMO a seconds value of 60 is more likely a bug than an attempt to represent a leap second. > _Given_ a `first`-like flag, nothing beyond that is really required > from Python (although it would be more useful if a way were given to > map a second value of "60" to/from first=False when applicable). > Using the `first` flag sounds like a good compromise then. (Unless leap seconds may happen at a time when some jurisdiction also changes DST?) > BTW, POSIX fixed the "2100 is not a leap year" problem in 2001 - but > ancient Internet rants never die ;-) > Good to know. :-) > > It's important to me that if two different Python implementations, > running > > on two different computers, convert a POSIX timestamp to a date+time they > > get the same result. > > That's important to everyone. It would remain true even if someone > did write an implementation of leap-second-aware arithmetic. > > > > This is *much* more important to me than the idea that if two computers > > simultaneously call time.time() they get the same value -- there is > simply > > no such thing as "simultaneously" (imagine one of the computers is on > > a rocket traveling to the moon). > > One factoid I learned recently: the best atomic clocks now are so > bloody sensitive that they run at detectably different rates if their > altitude changes by an inch (due to gravitational time dilation). 
> > Luckily, nobody yet has demanded Python support relativistic datetime > conversions ;-) > > > > If you care about leap seconds you should use a different time source, > and > > you shouldn't be using either the time module or the datetime module. > They > > are inextricably linked. > > Eh - it's a shallow problem. Just tedious. Adding `first` is a > crucial part of the battle for _all_ kinds of gap-and-fold-aware > arithmetic. And the code for all kinds of the latter is a tedious but > conceptually trivial chore. Accounting for clocks jumping around is > no harder "theoretically" when the jump is caused by a leap second > than when it's caused by daylight time starting or ending. > > > > So there's my answer to #1. You may consider this a Pronouncement if you > > wish. It should not come as any surprise. > > You don't want leap-aware arithmetic in the core. Neither do I. > > The remaining question is whether you want to make it impossible for > someone else to add it. > That depends on the cost for everyone else. If the cost is PEP 500, that's too high a price. > > And while I'm at it, I don't think PEP 500 is the answer. If you really > want > > the number of real-world seconds between two datetime values you can > write > > your own difftime() function that consults a leap seconds database. > > That's a start. Related questions include "the nuclear reactor has to > be vented exactly 3600 SI seconds from now - what will the local clock > say then?". > They should probably use their own clock hardware and software rather than use time.time(). > I agree such apps "should be" using TAI. But the one actual scientist > who has posted here most often says they don't get a choice about the > time system used by the data they need to analyze. I say they should > convert all the data they're given to TAI. They don't want to hear > that ;-) > But what does their data look like? 
And what time source originally generated it (and how does that time source handle leap seconds)? It also makes a big difference what kind of intervals they are looking at. > > As for how to request timeline arithmetic vs. the other kind ("human"? I > > forget where our glossary is), that could be done by special-casing in > the > > datetime class using some property (yet to be determined) of the tzinfo > > subclass or instance; or it could be done using different timedelta-ish > > classes. PEP 500 seems overly general just so it can also support the > leap > > second case. > > I've mentioned this before: I think it's insane to try to implement > "human arithmetic" by overloading arithmetic operators. See the > iCalendar spec for the least people expect now. Things like "the > first Tuesday after the first Monday in November every 4 years" (US > presidential election dates) are just the start. Trying to spell all > that with combinations of +-*/% would be a write-only nightmare. > We don't have to spell all of that using arithmetic operators. But we can write it all using simple-minded functions on top of operators that behave intuitively. E.g. the first Wednesday in November could be determined by taking November 1 and then doing something like "while it's not Wednesday, add a day". > dateutil implements the full iCalendar RRULE spec, and sanely uses > functions with lots of keyword arguments. Nobody is _really_ going to > improve on that. > > So I don't see PEP 500 as aiming at human arithmetic at all (although > others may). I do see it as a way to: > > 1. Keep _all_ gap-and-fold timeline arithmetics out of the core. I > don''t really want, e.g., DST-aware timeline arithmetic in the core > either. > > 2. Let tzinfo implementers decide which kinds of gaps and folds they care > about. > > > > ("So how do I write a real-time stock trading system", you may ask. Good > > question. Ask the stock exchanges. Their solution was not to trade near > the > > leap second. 
Given that they probably have to deal with a mix of > languages > > including at least Java, Cobol, Lisp, Python, and Smalltalk, I'm doubtful > > that they'll do better during the lifetime of Python 3. Famous last words > > perhaps.) > > Most shut down for at least an hour around the last leap second, > because a leap second gets inserted just as the hour (on clocks > running at a multiple-of-hour offset from UTC) is about to change. > Experience taught them last time around that software confusions > quickly propagate across fields :-( > > But that's a different problem. Python isn't the _source_ of anyone's > notion of time. CPython inherits all it can know about "what time is > it now?" from the OS and platform C libraries. > Sure. And? -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Mon Aug 17 22:46:18 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 17 Aug 2015 16:46:18 -0400 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... Message-ID: On Mon, Aug 17, 2015 at 3:05 PM, Guido van Rossum wrote: > On Sun, Aug 16, 2015 at 9:10 PM, Tim Peters wrote: .. >> Eh - I don't see that we are. The `first` flag is applicable to any >> source of folds and gaps, provided the folds aren't worse than 2-to-1. >> Leap seconds are just one more case of that. > > > That was in reference to PEP 500 more than to the `first` flag. Changing the subject accordingly. > >> >> If you want to build in gap-and-fold-aware arithmetic, then it seems >> only one kind can be supported. If it's left to tzinfo implementers, >> then it's up to them. > > > Well, most tzinfo implementers have other priorities; leap-secondists, while > loud, are rare enough that the extra freedom just complexificates the API > for everyone else without reason. 
> I don't see how PEP-0500 affects anyone other than implementers of exotic tzinfo implementations. Regular users will just need to pick between subclassing datetime.tzinfo and get "human" arithmetic within their timezone or subclass tzstrict to get "strict" arithmetic. Most user will not even need that because their vendor of tzinfo implementations will have made the choice for them. .. >> The remaining question is whether you want to make it impossible for >> someone else to add it. > > > That depends on the cost for everyone else. If the cost is PEP 500, that's > too high a price. > It is unlikely that leap seconds will disappear from the definition of UTC. It is likely that something like Google's "leap smear" will become the best practice to be followed for years (if not decades) to come. Doing real-time calculations on the data recorded by a computer with a leap-smeared time source is even harder than on the data with true UTC timestamps. Nevertheless, PEP-0500 will enable someone to implement a "leap_smear_tzinfo" class and have datetime arithmetics that knows that some 1000 seconds in a given day are more equal than others. >> >> > And while I'm at it, I don't think PEP 500 is the answer. If you really >> > want >> > the number of real-world seconds between two datetime values you can >> > write >> > your own difftime() function that consults a leap seconds database. >> >> That's a start. Related questions include "the nuclear reactor has to >> be vented exactly 3600 SI seconds from now - what will the local clock >> say then?". > > > They should probably use their own clock hardware and software rather than > use time.time(). > PEP-0500 is not about time sources or time counters. It is the answer to a seemingly simple question: how many seconds have passed from T12:00 to T12:00. The answer depends on your definition of 12:00. 
If you define 12:00 as the time when the gnomon shadow passes the noon tick on the city clock tower, then the answer will be different every day. If you use a "coordinated" time scale, then the answer will be different every few years. If you are sophisticated enough that your time source delivers meaningful nanoseconds - you probably need to consult the tide table for your area to know the true distance between your time source ticks. Nevertheless, if the recorded timestamps appear in your data in an ISO 8601-like format, someone could write a PEP-0500 tzinfo implementation that would let you manipulate your timestamps like any other datetime instances.

>> I agree such apps "should be" using TAI. But the one actual scientist
>> who has posted here most often says they don't get a choice about the
>> time system used by the data they need to analyze. I say they should
>> convert all the data they're given to TAI. They don't want to hear
>> that ;-)
>
> But what does their data look like? And what time source originally
> generated it (and how does that time source handle leap seconds)? It also
> makes a big difference what kind of intervals they are looking at.

Suppose you are the market regulator looking at the trades from two traders:

    Trader A       Trader B
    23:59:59.01    23:59:59.30
    23:59:59.40    23:59:59.45
    23:59:60.00    23:59:59.40

Did trader B place his last trade before trader A did? A $100M lawsuit outcome depends on your answer! There is probably enough information in the evidence to conclude that Trader A printed UTC timestamps and Trader B was POSIX compliant and repeated the 23:59:59 second. You can get their ntpd logs and interview their system administrators, but at the end of the day - wouldn't it be nice to use Python to analyze all that data and win your case with a simple dta.replace(tzinfo=TraderA) < dtb.replace(tzinfo=TraderB)?
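
For comparison, the stand-alone difftime() that Guido suggests in the text quoted above might look roughly like the sketch below. It is an illustration only: the helper name is borrowed from the quote, the table is deliberately partial (a real tool would load the full list from a maintained leap-second source), and the inputs are assumed to be aware datetimes expressed in UTC.

```python
from datetime import datetime, timezone

# Partial, illustrative table: the first UTC instant *after* each inserted
# leap second. Only a few well-known insertions are listed.
LEAP_SECOND_BOUNDARIES = [
    datetime(2009, 1, 1, tzinfo=timezone.utc),  # after 2008-12-31T23:59:60Z
    datetime(2012, 7, 1, tzinfo=timezone.utc),  # after 2012-06-30T23:59:60Z
    datetime(2015, 7, 1, tzinfo=timezone.utc),  # after 2015-06-30T23:59:60Z
]


def difftime(later, earlier):
    """Elapsed SI seconds between two aware datetimes (hypothetical helper)."""
    if later < earlier:
        return -difftime(earlier, later)
    # Count every leap second inserted strictly after `earlier` and no
    # later than `later`, then add them to the POSIX-style difference.
    leaps = sum(1 for b in LEAP_SECOND_BOUNDARIES if earlier < b <= later)
    return (later - earlier).total_seconds() + leaps
```

Called with 2015-06-30 23:59:59 UTC and 2015-07-01 00:00:00 UTC, this returns 2.0 rather than the 1.0 that plain subtraction gives. It cannot represent the 23:59:60 instant itself, which is the part of the problem the message above wants a tzinfo-level protocol for.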
From guido at python.org Mon Aug 17 23:16:20 2015 From: guido at python.org (Guido van Rossum) Date: Mon, 17 Aug 2015 14:16:20 -0700 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: Message-ID: On Mon, Aug 17, 2015 at 1:46 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > On Mon, Aug 17, 2015 at 3:05 PM, Guido van Rossum > wrote: > > On Sun, Aug 16, 2015 at 9:10 PM, Tim Peters > wrote: > .. > >> Eh - I don't see that we are. The `first` flag is applicable to any > >> source of folds and gaps, provided the folds aren't worse than 2-to-1. > >> Leap seconds are just one more case of that. > > > > > > That was in reference to PEP 500 more than to the `first` flag. > > Changing the subject accordingly. > Fine. > > > >> > >> If you want to build in gap-and-fold-aware arithmetic, then it seems > >> only one kind can be supported. If it's left to tzinfo implementers, > >> then it's up to them. > > > > > > Well, most tzinfo implementers have other priorities; leap-secondists, > while > > loud, are rare enough that the extra freedom just complexificates the API > > for everyone else without reason. > > > > I don't see how PEP-0500 affects anyone other than implementers of > exotic tzinfo implementations. Regular users will just need to pick > between subclassing datetime.tzinfo and get "human" arithmetic within > their timezone or subclass tzstrict to get "strict" arithmetic. Most > user will not even need that because their vendor of tzinfo > implementations will have made the choice for them. > Well, everyone implementing a tzinfo class will be confronted with the question whether to provide those special methods or not. And they may well be copy/pasting code that implements them. So my claim is that this makes the life of everyone implementing a tzinfo a little more complex, not just that of tzinfo implementers who actually need this protocol. 
Just like the mere existence of __length_hint__ serves as a distraction for anyone implementing an iterator. > > .. > >> The remaining question is whether you want to make it impossible for > >> someone else to add it. > > > > > > That depends on the cost for everyone else. If the cost is PEP 500, > that's > > too high a price. > > > > It is unlikely that leap seconds will disappear from the definition of > UTC. It is likely that something like Google's "leap smear" will > become the best practice to be followed for years (if not decades) to > come. Why not centuries? FWIW, Dropbox did something similar for the most recent leap second. > Doing real-time calculations on the data recorded by a computer > with a leap-smeared time source is even harder than on the data with > true UTC timestamps. Nevertheless, PEP-0500 will enable someone to > implement a "leap_smear_tzinfo" class and have datetime arithmetics > that knows that some 1000 seconds in a given day are more equal than > others. > Let's put this in perspective though. A second per day is close to an astonishing 0.001% accuracy. Measuring time can be done with insane accuracy compared to most other physical quantities, and for some scientists, that accuracy is not enough. But I still think the effort spent discussing it is out of proportion. Instead of PEP 500, people who care can write their own library. Being able to redefine the meaning of a+b to support such an esoteric case seems unnecessary -- they can just write their own add(a, b) function. > > >> > >> > And while I'm at it, I don't think PEP 500 is the answer. If you > really > >> > want > >> > the number of real-world seconds between two datetime values you can > >> > write > >> > your own difftime() function that consults a leap seconds database. > >> > >> That's a start. Related questions include "the nuclear reactor has to > >> be vented exactly 3600 SI seconds from now - what will the local clock > >> say then?". 
> > > > > > They should probably use their own clock hardware and software rather > than > > use time.time(). > > > > PEP-0500 is not about time sources or time counters. It is the answer > to a seemingly simple question: how many seconds have passed from > T12:00 to T12:00. The answer depends on your definition of > 12:00. If you define 12:00 as the time when the gnomon shadow passes > the noon tick on the city clock tower, then the answer will be > different every day. If you use a "coordinated" time scale, then the > answer will be different every few years. If you are sophisticated > enough that your time source delivers meaningful nanoseconds - you > probably need to consult the tide table for your area to to know the > true distance between your time source ticks. Nevertheless, if the > recorded timestamps appear in your data in ISO 8601 - like format, > someone could write a PEP-0500 tzinfo implementation that would let > you manipulate your timestamps as any other datetime instances. > I don't buy that it's necessary to be able to use the + operator for this. I *may* accept that it's necessary to be able to parse ISO 8601 (-like) formats that include leap seconds. And I may even accept that a datetime instance with tzinfo is a better return value of such a parse call than a tuple. But I still believe that the use case is esoteric enough that the parsing method should be given a special flag to allow seconds==60. > > >> > >> I agree such apps "should be" using TAI. But the one actual scientist > >> who has posted here most often says they don't get a choice about the > >> time system used by the data they need to analyze. I say they should > >> convert all the data they're given to TAI. They don't want to hear > >> that ;-) > > > > > > But what does their data look like? And what time source originally > > generated it (and how does that time source handle leap seconds)? It also > > makes a big difference what kind of intervals they are looking at. 
> > > > Suppose you are the market regulator looking at the trades from two > traders: > > Trader A Trader B > 23:59:59.01 23:59:59.30 > 23:59:59.40 23:59:59.45 > 23:59:60.00 23:59:59.40 > > Did trader B place his last trade before trader A did? A $100M > lawsuit outcome depends on your answer! There is probably enough > information in the evidence to conclude that Trader A printed UTC > timestamps and Trader B was POSIX compliant and repeated the 23:59:59 > second. You can get their ntpd logs and interview their system > administrators, but at the end of the day - wouldn't it be nice to use > Python to analyze all that data and win your case with a simple > dta.replace(tzinfo=TraderA) < dtb.replace(tzinfo=TraderB)? > That's a straw man, right? The stock markets closed around the leap second because they can't deal with this. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Tue Aug 18 00:05:24 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 17 Aug 2015 17:05:24 -0500 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: Message-ID: [Guido] >>> How did we end up bending over this far backwards for leap seconds? [Tim] >> Eh - I don't see that we are. The `first` flag is applicable to any >> source of folds and gaps, provided the folds aren't worse than 2-to-1. >> Leap seconds are just one more case of that. [Guido] > That was in reference to PEP 500 more than to the `first` flag. Sure, except I read it as being entirely about PEP 500 (since, if leap seconds had never existed, nothing in PEP 495 would be any different). I just don't see any "bending over this far backwards" in PEP 500. It's a simple protocol allowing for implementers of local clocks to implement arithmetic (and datetime<->string too, if they need it) reflecting how their local clocks work. It's not complex. 
The hardest part for _anyone_ will be implementing .utcoffset() and .dst() to reflect all the goofy local rules. But even then, that's not complexity for _users_, it's complexity for a relative handful of tzinfo authors, and there's no possible way to hide the arbitrary lumpiness of local rules from the people responsible for _implementing_ arbitrarily lumpy local rules. > ... > Well, most tzinfo implementers have other priorities; They're not required to implement leap seconds. Only someone who _wants_ leap second support has to give a moment's thought to it. > leap-secondists, while loud, are rare enough that the extra freedom just > complexificates the API for everyone else without reason. Can only repeat I don't see undue complexities here. Did you notice that PEP 500 already contains a more-or-less complete implementation of "Olson-aware" timeline arithmetic (i.e., accounting for both DST transitions and changes to base UTC offset)? It's straightforward, not complex (but again assuming .utcoffset() and .dst() have been implemented to meet PEP 495: everyone is wishing away ;-) the code needed to implement "the hard part"). And "everyone else" in this context means just the relative handful of people in the world motivated enough to wrap the Olson database and/or POSIX TZ strings and/or Microsoft registry and/or ... timezone sources in tzinfo subclass objects. You override the methods you need, and leave the rest alone. >> ... >> Do note that POSIX supports a calendar notation for >> leap seconds too (allowing tm_sec to be 60). > But datetime does not. IMO a seconds value of 60 is more likely a bug than > an attempt to represent a leap second. That's fine, and appropriate. For someone who _does_ care about leap seconds enough to do the work, the PEP 500 protocol allows them to take over the datetime<->string operations to (if they choose) accept and/or produce a string with a "second" value of 60 in case of an inserted leap second. 
This doesn't burden anyone else, or make second=60 an uncaught error in anyone else's tzinfo implementation. People who do use leap seconds are already acutely aware of what second=60 means. >> ... >> _Given_ a `first`-like flag, nothing beyond that is really required >> from Python (although it would be more useful if a way were given to >> map a second value of "60" to/from first=False when applicable). > Using the `first` flag sounds like a good compromise then. Internally I expect that's the only way a leap second would be represented. The "second=60" part is universally understood already by people who care about leap seconds, but their expectations about that begin and end with the strings they type and see. > (Unless leap seconds may happen at a time when some > jurisdiction also changes DST?) If the definition of UTC doesn't change, we'll _eventually_ need to add at least one leap second every minute ;-) But for now, they're only inserted at the end of a June or December, which for physical reasons are far from any known (by me ;-) ) DST transition points. I'm more concerned that they'd coincide with a base-offset-from-UTC transition, since "new year, new timezone!" sounds like something a politician would like :-( In any case, even if they do coincide, it's the same as when a DST transition coincides with a base-UTC-offset change: it's a potential headache for the tzinfo implementer, but no problem for anything in PEPs 495 or 500 _provided that_ folds remain no worse than 2-to-1 (i.e., that a single bit suffices to resolve ambiguities). > ... > That depends on the cost for everyone else. If the cost is PEP 500, that's > too high a price. What changed? Not long ago you said you liked the tzstrict idea best. But seemingly out of nowhere you seem to hate it now. I don't get it. To me it still looks simple and straightforward. > ... > They should probably use their own clock hardware and software rather than > use time.time(). > ... 
> But what does their data look like? And what time source originally > generated it (and how does that time source handle leap seconds)? It also > makes a big difference what kind of intervals they are looking at. Would answers to those questions make any real difference to you? If so, you can dig through all Chris Barker's posts here, or invite him to write them again ;-) The answers don't really matter to me, just because it's a fact about the world that UTC is a compromise between UT1 and TAI. It was very deliberate that a UTC second is exactly 1 SI second always, so scientists relying on local civil time (defined in turn as an offset from UTC) to record points in time, and computing deltas later, are just doing something UTC guarantees "will work as expected". It's not scientists' fault that computer nerds refuse to support a fundamental worldwide time standard that happens to be inconvenient for computer nerds ;-) [human arithmetic] > We don't have to spell all of that using arithmetic operators. But we can > write it all using simple-minded functions on top of operators that behave > intuitively. E.g. the first Wednesday in November could be determined by > taking November 1 and then doing something like "while it's not Wednesday, > add a day". I strongly encourage everyone interested in "human arithmetic" to look at what dateutil already does. It does supply a timedelta-ish "relative delta" type for doing simple period arithmetic, but goes far beyond that. https://labix.org/python-dateutil That should be in a different thread, but far as I'm concerned we should just ask Gustavo to fold it into the core and declare success :-) >>> ("So how do I write a real-time stock trading system", you may ask... >> Most shut down for at least an hour around the last leap second ... >> But that's a different problem ... > Sure. And? It was just a polite way of saying "Sure. And?" 
to you ;-)

From guido at python.org Tue Aug 18 01:11:59 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 17 Aug 2015 16:11:59 -0700
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To:
References:
Message-ID:

Sorry, I don't have time to split any more hairs about this. I just hate
leap seconds and anything that mentions them. If PEP 500 exists solely to
support leap seconds, it's dead. If it exists to differentiate between
"human" and "timeline" arithmetic, I would prefer a single bit (perhaps
represented as a subclass test) over PDDM. But that discussion can wait
until the PEP-500 thread.

--
--Guido van Rossum (python.org/~guido)

From tim.peters at gmail.com Tue Aug 18 03:22:18 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 17 Aug 2015 20:22:18 -0500
Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ...
In-Reply-To:
References:
Message-ID:

[Guido]
> Well, everyone implementing a tzinfo class will be confronted with the
> question whether to provide those special methods or not. And they may well
> be copy/pasting code that implements them. So my claim is that this makes
> the life of everyone implementing a tzinfo a little more complex, not just
> that of tzinfo implementers who actually need this protocol. Just like the
> mere existence of __length_hint__ serves as a distraction for anyone
> implementing an iterator.

Seriously: who writes tzinfo classes? I do, but I don't personally
give a rip about any form of timeline arithmetic. Who writes
iterators? A better question for _that_ is "who doesn't?" ;-)

AFAICT, only two people in over a dozen years have been serious tzinfo
authors, and every serious project I ever heard of uses one of their
two packages:

- Stuart Bishop, whose pytz wraps the Olson database, plus endures
  enormous pain to disambiguate folds.
- Gustavo Niemeyer, whose dateutil wraps all of (at least) the Olson database, POSIX TZ strings, Windows registry timezones, and iCalendar-style VTIMEZONE files. Am I missing any? If not, who are we concerned about? I'm pretty sure Stuart and Gustavo have amply demonstrated they're capable of dealing with things dozens of times more complex than anything discussed here. Nobody else cares: they just grab a package and use it. That's how it should be. If and when canned timezones are supplied with Python, one person in the core may still care - plus, of course, people highly motivated to do things the core doesn't cater to. > ... > That's a straw man, right? The stock markets closed around the leap second > because they can't deal with this. Just because I know everyone is fascinated by this ;-) , do note that there are many financial markets around the world, and they no more agree on what to do than anyone else does. Examples for _this_ round of leap second include: - NASDAQ closed for the day early. - Many Asian markets opened late. - Many global exchanges shut down before the leap second insertion and opened again after the next hour passed. - Brazilian markets added their leap second two days early, on Sunday when Brazil's markets were closed. They remained open across "the real" leap second insertion (which their computers' time services were fiddled to ignore). - Some markets ignored the potential problems entirely and just keep going. I don't know whether any problems occurred this time. "The news" is great at publicizing pre-event hysteria, but poor on following up. 
From alexander.belopolsky at gmail.com Tue Aug 18 04:12:21 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 17 Aug 2015 22:12:21 -0400
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To:
References:
Message-ID:

[Posted on Python-Dev]

On Sun, Aug 16, 2015 at 3:23 PM, Guido van Rossum wrote:
> I think that a courtesy message to python-dev is appropriate, with a link to
> the PEP and an invitation to discuss its merits on datetime-sig.

Per Guido's advice, this is an invitation to join the PEP 495 discussion
on Datetime-SIG. If you would like to catch up on the SIG discussion,
the archive of this thread starts at . The PEP itself can be found at ,
but if you would like to follow draft updates as they happen, you can
do it on GitHub at .

Even though the PEP is deliberately minimal in scope, there are still
a few issues to be ironed out, including what to call the disambiguation
flag. It is agreed that the name should not refer to DST and should
distinguish between two ambiguous times by their chronological order.
The working name is "first", but no one particularly likes it, including
the author of the PEP. Some candidates are discussed in the PEP at ,
and some more have been suggested that I will add soon.

Please direct your responses to .

From guido at python.org Tue Aug 18 06:19:33 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 17 Aug 2015 21:19:33 -0700
Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ...
In-Reply-To:
References:
Message-ID:

So we're writing a PEP for two people? Will they even use it?

On Monday, August 17, 2015, Tim Peters wrote:
> [Guido]
> > Well, everyone implementing a tzinfo class will be confronted with the
> > question whether to provide those special methods or not. And they may well
> > be copy/pasting code that implements them.
So my claim is that this makes > > the life of everyone implementing a tzinfo a little more complex, not > just > > that of tzinfo implementers who actually need this protocol. Just like > the > > mere existence of __length_hint__ serves as a distraction for anyone > > implementing an iterator. > > Seriously: who writes tzinfo classes? I do, but I don't personally > give a rip about any form of timeline arithmetic. Who writes > iterators? A better question for _that_ is "who doesn't?" ;-) > > AFAICT, only two people in over a dozen years have been serious tzinfo > authors, and every serious I project I ever heard of uses one of their > two packages: > > - Stuart Bishop, whose pytz wraps the Olson database, plus endures > enormous pain to disambiguate folds. > > - Gustavo Niemeyer, whose dateutil wraps all of (at least) the Olson > database, POSIX TZ strings, Windows registry timezones, and > iCalendar-style VTIMEZONE files. > > Am I missing any? > > If not, who are we concerned about? I'm pretty sure Stuart and > Gustavo have amply demonstrated they're capable of dealing with things > dozens of times more complex than anything discussed here. Nobody > else cares: they just grab a package and use it. That's how it > should be. If and when canned timezones are supplied with Python, one > person in the core may still care - plus, of course, people highly > motivated to do things the core doesn't cater to. > > > > ... > > That's a straw man, right? The stock markets closed around the leap > second > > because they can't deal with this. > > Just because I know everyone is fascinated by this ;-) , do note that > there are many financial markets around the world, and they no more > agree on what to do than anyone else does. Examples for _this_ round > of leap second include: > > - NASDAQ closed for the day early. > > - Many Asian markets opened late. > > - Many global exchanges shut down before the leap second insertion and > opened again after the next hour passed. 
> > - Brazilian markets added their leap second two days early, on Sunday > when Brazil's markets were closed. They remained open across "the > real" leap second insertion (which their computers' time services were > fiddled to ignore). > > - Some markets ignored the potential problems entirely and just keep going. > > I don't know whether any problems occurred this time. "The news" is > great at publicizing pre-event hysteria, but poor on following up. > -- --Guido van Rossum (on iPad) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Tue Aug 18 06:59:54 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 17 Aug 2015 23:59:54 -0500 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: Message-ID: [Guido] > So we're writing a PEP for two people? Probably, but I don't think we know _which_ two people yet. If we also include PEP 500 along its current lines, the number may even jump to four or five ;-) > Will they even use it? I can't speak for Stuart or Gustavo, but _someone_ will. I think in somewhat the same way, e.g., Python added seemingly hyper-general "rich comparisons" for Travis Oliphant, so he could implement element-wise array comparisons in numpy. The number of people who need to know how to implement a thing has nothing to do with how many people end up _using_ the thing - it's no indication of the thing's importance. Dealing with folds and gaps due to DST and base-offset transitions only _needs_ to be done once, so, sure, in that sense PEP 495 alone is being written for one or two people: whoever jumps in first to complete wrapping the Olson database with 495-compliant .utcoffset() and .dst() implementations, and possibly overtaken by someone else who goes on to wrap all other common sources of timezone info too. Once the common sources _are_ all wrapped, who else would bother to duplicate the work? 
There's no point to it beyond personal amusement. I expect the number
of people who truly want to implement their own non-Olson
non-Microsoft non-POSIX-TZ non-VTIMEZONE
offset-from-POSIX-approximation-to-UTC timezones is exactly zero.

Honest: do _you_ believe there are more than two people in the world
who are motivated enough to do all that work? It's tedious. I expect
that's why, however many people may have started down that path, only
two packages finished it.

That doesn't mean the packages aren't important. pytz and dateutil are
already widely used, despite fighting a design gap PEP 495 aims to
fill. My bet is that at least Stuart would love to throw out all the
under-the-cover complications pytz added to worm around it. But I
don't know; e.g., maybe he's plain burned out on it. But _someone_
will still have - or generate - sufficient enthusiasm :-)

From ethan at stoneleaf.us Tue Aug 18 16:39:19 2015
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 18 Aug 2015 07:39:19 -0700
Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ...
In-Reply-To:
References:
Message-ID: <55D34397.9030502@stoneleaf.us>

On 08/17/2015 02:16 PM, Guido van Rossum wrote:
> On Mon, Aug 17, 2015 at 1:46 PM, Alexander Belopolsky wrote:

> Well, everyone implementing a tzinfo class will be confronted with the question
> whether to provide those special methods or not. And they may well be copy/pasting
> code that implements them. So my claim is that this makes the life of everyone
> implementing a tzinfo a little more complex, not just that of tzinfo implementers
> who actually need this protocol. Just like the mere existence of __length_hint__
> serves as a distraction for anyone implementing an iterator.
Having briefly read PEP 500 it seems to me the primary use case is the DST transition; I can easily imagine experiments, studies, laboratory processes, etc., that need to be aware of how many hours have/will have passed, and being off by that one hour (or 30 minutes, or whatever) is simply not acceptable. -- ~Ethan~ From guido at python.org Tue Aug 18 17:07:26 2015 From: guido at python.org (Guido van Rossum) Date: Tue, 18 Aug 2015 08:07:26 -0700 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: <55D34397.9030502@stoneleaf.us> References: <55D34397.9030502@stoneleaf.us> Message-ID: On Tue, Aug 18, 2015 at 7:39 AM, Ethan Furman wrote: > On 08/17/2015 02:16 PM, Guido van Rossum wrote: > >> On Mon, Aug 17, 2015 at 1:46 PM, Alexander Belopolsky wrote: >> > > Well, everyone implementing a tzinfo class will be confronted with the >> question >> whether to provide those special methods or not. And they may well be >> copy/pasting >> code that implements them. So my claim is that this makes the life of >> everyone >> implementing a tzinfo a little more complex, not just that of tzinfo >> implementers >> who actually need this protocol. Just like the mere existence of >> __length_hint__ >> serves as a distraction for anyone implementing an iterator. >> > > Having briefly read PEP 500 it seems to me the primary use case is the DST > transition; I can easily imagine experiments, studies, laboratory > processes, etc., that need to be aware of how many hours have/will have > passed, and being off by that one hour (or 30 minutes, or whatever) is > simply not acceptable. > Yes, but there are other options. PEP 500 allows way more freedom than is needed. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From guido at python.org Tue Aug 18 17:28:41 2015
From: guido at python.org (Guido van Rossum)
Date: Tue, 18 Aug 2015 08:28:41 -0700
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To:
References:
Message-ID:

Proposal: name the flag 'fold', reverse its sense, and be done with it
(then move on to PEP 500). So fold defaults to False.

From tim.peters at gmail.com Tue Aug 18 17:36:55 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 18 Aug 2015 10:36:55 -0500
Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ...
In-Reply-To:
References: <55D34397.9030502@stoneleaf.us>
Message-ID:

[Ethan Furman]
>> Having briefly read PEP 500 it seems to me the primary use case is the DST
>> transition; I can easily imagine experiments, studies, laboratory processes,
>> etc., that need to be aware of how many hours have/will have passed, and
>> being off by that one hour (or 30 minutes, or whatever) is simply not
>> acceptable.

[Guido]
> Yes, but there are other options. PEP 500 allows way more freedom than is
> needed.

Another approach to consider: _nothing_ is needed beyond PEP 495. As
PEP 500 shows with concrete code, those who want DST-aware arithmetic
can easily get it (after PEP 495-compliant timezone wrappers are
available - otherwise this approach fails in a handful of cases) via
simple functions. For example (copy/paste/edit from the PEP):

    from datetime import timezone

    def timeline_add(dt, delta):
        # Do the arithmetic in UTC, then convert back to dt's own zone.
        asutc = dt.astimezone(timezone.utc)
        return (asutc + delta).astimezone(dt.tzinfo)

That's akin to what I've done in the very few cases I wanted timeline
arithmetic. It's fine by me that "+" isn't overloaded because I need
so little of this. Indeed, it's _because_ I need so little of this
that I _prefer_ to use a named function: that makes it dead obvious
I'm doing something unusual (for me).
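
A minimal runnable sketch of the timeline_add idea above, assuming
Python 3.6 or later (where PEP 495 eventually shipped with an integer
`fold` attribute). ToyZone and its single fall-back transition are
hypothetical, made up purely for illustration; this is not pytz,
dateutil, or code from either PEP. It contrasts classic "+" with
timeline_add across the transition:

```python
from datetime import datetime, timedelta, timezone, tzinfo

# Hypothetical toy zone: UTC-4 ("summer") until one fall-back transition
# at 2015-11-01 06:00 UTC, UTC-5 ("winter") afterwards.
SUMMER = timedelta(hours=-4)
WINTER = timedelta(hours=-5)
SWITCH_UTC = datetime(2015, 11, 1, 6, 0)   # naive UTC instant of the switch
FOLD_START = datetime(2015, 11, 1, 1, 0)   # local 01:00-01:59 happens twice
FOLD_END = datetime(2015, 11, 1, 2, 0)


class ToyZone(tzinfo):
    def utcoffset(self, dt):
        local = dt.replace(tzinfo=None)
        if local < FOLD_START:
            return SUMMER
        if local >= FOLD_END:
            return WINTER
        # Ambiguous hour: fold=0 is the earlier (summer) reading.
        return WINTER if dt.fold else SUMMER

    def dst(self, dt):
        return self.utcoffset(dt) - WINTER

    def tzname(self, dt):
        return "TOY"

    def fromutc(self, dt):
        # dt carries UTC clock values with tzinfo=self (tzinfo protocol).
        utc = dt.replace(tzinfo=None)
        if utc < SWITCH_UTC:
            return (utc + SUMMER).replace(tzinfo=self, fold=0)
        local = utc + WINTER
        return local.replace(tzinfo=self, fold=1 if local < FOLD_END else 0)


def timeline_add(dt, delta):
    # Do the arithmetic in UTC, then convert back to dt's own zone.
    asutc = dt.astimezone(timezone.utc)
    return (asutc + delta).astimezone(dt.tzinfo)


toy = ToyZone()
saturday_noon = datetime(2015, 10, 31, 12, 0, tzinfo=toy)
day = timedelta(hours=24)

classic = saturday_noon + day                 # wall clock: Sunday 12:00
timeline = timeline_add(saturday_noon, day)   # elapsed time: Sunday 11:00
print(classic, timeline)
```

In this toy zone, classic "+" lands on Sunday 12:00 wall-clock time,
while timeline_add lands on Sunday 11:00, because 24 elapsed hours span
the repeated hour.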
Someone could easily convince me _they_ need "a lot" of it. But
someone else could easily convince me they need "a lot" of
leap-second-aware arithmetic. It's not for me to say one is right and
the other is wrong. That's for Guido to say ;-)

From alexander.belopolsky at gmail.com Tue Aug 18 17:38:36 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 18 Aug 2015 11:38:36 -0400
Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ...
In-Reply-To:
References: <55D34397.9030502@stoneleaf.us>
Message-ID:

On Tue, Aug 18, 2015 at 11:07 AM, Guido van Rossum wrote:
> PEP 500 allows way more freedom than is needed.

This is right. With respect to finding elapsed time between two times
(x and y) expressed as datetime instances, PEP-0500 allows a tzinfo
implementor to provide an arbitrary function of two variables: d(x,
y), while arguably in most cases d(x, y) = x - f(x) - y + f(y), where
f is a function of a single variable. (In terms of the current tzinfo
interface, f is the utcoffset() function.) In other words, the PEP
allows arbitrary stretching and shrinking of the time line, while in
most use cases only cutting and shifting is needed.

However, I don't see this additional freedom as a big complication.
Even in the common case, it may be easier to implement d(x, y) than to
figure out f(x). The problem with f(x) is that it is the UTC offset
as a function of local time, while most TZ database interfaces only
provide UTC offset as a function of UTC time. As a result, it is
often easier to implement d(x, y) (for example, as d(x, y) = g(x) -
g(y)) than to implement f(x).

From alexander.belopolsky at gmail.com Tue Aug 18 17:55:30 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 18 Aug 2015 11:55:30 -0400
Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ...
In-Reply-To:
References: <55D34397.9030502@stoneleaf.us>
Message-ID:

On Tue, Aug 18, 2015 at 11:36 AM, Tim Peters wrote:
> [Ethan Furman]
>>> Having briefly read PEP 500 it seems to me the primary use case is the DST
>>> transition; I can easily imagine experiments, studies, laboratory processes,
>>> etc., that need to be aware of how many hours have/will have passed, and
>>> being off by that one hour (or 30 minutes, or whatever) is simply not
>>> acceptable.
>
> [Guido]
>> Yes, but there are other options. PEP 500 allows way more freedom than is
>> needed.
>
> Another approach to consider: _nothing_ is needed beyond PEP 495.

+1 - Implementation of PEP 495 in the stdlib will keep tzinfo providers
busy for quite some time. Hopefully, we will be able to get the
implementation in early enough in the 3.6 cycle to give them an
opportunity to release concurrently with CPython.

My main concern with the alternative arithmetic is that while it may
be a desired feature for many users, it may complicate implementation
of timezone conversions and the inner workings of concrete tzinfo
subclasses. I think we should give tzinfo providers a chance to
digest PEP 495 before we hit them with PEP 500.

From alexander.belopolsky at gmail.com Tue Aug 18 18:07:39 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 18 Aug 2015 12:07:39 -0400
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To:
References:
Message-ID:

On Tue, Aug 18, 2015 at 11:28 AM, Guido van Rossum wrote:
> Proposal: name the flag 'fold', reverse its sense, and be done with it
> (then move on to PEP 500). So fold defaults to False.

I like "fold". It is short, uses the word from the problem domain and I
think the ascii-art below can serve as the mnemonic:

          fold=True
            +---+
             \  |
              \ |
  fold=False   \|   fold=False
  ------------->+-----------

-------------- next part -------------- An HTML attachment was scrubbed...
URL:

From tim.peters at gmail.com Tue Aug 18 18:30:25 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 18 Aug 2015 11:30:25 -0500
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To:
References:
Message-ID:

[Guido]
>> Proposal: name the flag 'fold', reverse its sense, and be done with it
>> (then move on to PEP 500). So fold defaults to False.

[Alexander]
> I like "fold". It is short, uses the word from the problem domain

I suppose I'm being dense, but what does fold=True mean? Clearly it
means "this is an ambiguous time" - but _which_ one is intended? The
earlier or the later?

> and I think the ascii-art below can serve as the mnemonic:
>
>           fold=True
>             +---+
>              \  |
>               \ |
>   fold=False   \|   fold=False
>   ------------->+-----------

Same question, alas - I don't see what the diagram is trying to tell
me, apart from that there's some kind of 2-to-1 relationship.

From Guido's "reverse its sense" I'm guessing fold=True means "this is
the later of two ambiguous times", but there's nothing in the word
"fold" on its own to suggest that. It's only a deduction from knowing
what first=True meant before (the earlier of two ambiguous times)
coupled with "reverse its sense". Or does fold=True mean something
else?

Related: would people have been happier with the obvious "first" if
it had been named "_first" instead? For the most part it's an
attribute intended to be set "by magic" by various timezone
operations, not something most users need to know anything about
(indeed, for most users it would qualify as an "attractive nuisance").

From guido at python.org Tue Aug 18 18:32:07 2015
From: guido at python.org (Guido van Rossum)
Date: Tue, 18 Aug 2015 09:32:07 -0700
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To:
References:
Message-ID:

That perfectly visualizes how I was thinking of it!

On Tue, Aug 18, 2015 at 9:07 AM, Alexander Belopolsky <alexander.belopolsky at gmail.com> wrote:

> On Tue, Aug 18, 2015 at 11:28 AM, Guido van Rossum wrote:
> > Proposal: name the flag 'fold', reverse its sense, and be done with it
> > (then move on to PEP 500). So fold defaults to False.
>
> I like "fold". It is short, uses the word from the problem domain and I
> think the ascii-art below can serve as the mnemonic:
>
>           fold=True
>             +---+
>              \  |
>               \ |
>   fold=False   \|   fold=False
>   ------------->+-----------
>

--
--Guido van Rossum (python.org/~guido)

From felipe.nospam.ochoa at gmail.com Tue Aug 18 18:43:06 2015
From: felipe.nospam.ochoa at gmail.com (Felipe Ochoa)
Date: Tue, 18 Aug 2015 16:43:06 +0000
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To:
References:
Message-ID: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com>

To me the problem is that in my mind both of the repeated times are
part of "the fold" so `fold=True` intuitively to me means the same as
`is_ambiguous=True`. What if we merge the proposals and set the
attribute to `last_fold`? Then the timeline looks like:

         last_fold=True
              +--+          (time flows counter-
               \ |           clockwise in the
                \|           loop)
---------------->+----------------->
last_fold=False       last_fold=False

Separately, the glossary can be found at:
https://github.com/felipeochoa/dt2/blob/master/glossary.md (where
there is also a WIP Duration object)

On 18 August 2015 at 12:30, Tim Peters <tim.peters at gmail.com> wrote:

[Guido]
>> Proposal: name the flag 'fold', reverse its sense, and be done with it
>> (then move on to PEP 500). So fold defaults to False.

[Alexander]
> I like "fold". It is short, uses the word from the problem domain

I suppose I'm being dense, but what does fold=True mean? Clearly it
means "this is an ambiguous time" - but _which_ one is intended? The
earlier or the later?

> and I think the ascii-art below can serve as the mnemonic:
>
>           fold=True
>             +---+
>              \  |
>               \ |
>   fold=False   \|   fold=False
>   ------------->+-----------

Same question, alas - I don't see what the diagram is trying to tell
me, apart from that there's some kind of 2-to-1 relationship.

From Guido's "reverse its sense" I'm guessing fold=True means "this is
the later of two ambiguous times", but there's nothing in the word
"fold" on its own to suggest that. It's only a deduction from knowing
what first=True meant before (the earlier of two ambiguous times)
coupled with "reverse its sense". Or does fold=True mean something
else?

Related: would people have been happier with the obvious "first" if
it had been named "_first" instead? For the most part it's an
attribute intended to be set "by magic" by various timezone
operations, not something most users need to know anything about
(indeed, for most users it would qualify as an "attractive nuisance").

From alexander.belopolsky at gmail.com Tue Aug 18 19:09:26 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 18 Aug 2015 13:09:26 -0400
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To:
References:
Message-ID:

On Tue, Aug 18, 2015 at 12:30 PM, Tim Peters wrote:
> [Guido]
> >> Proposal: name the flag 'fold', reverse its sense, and be done with it
> >> (then move on to PEP 500). So fold defaults to False.
>
> [Alexander]
> > I like "fold". It is short, uses the word from the problem domain
>
> I suppose I'm being dense, but what does fold=True mean? Clearly it
> means "this is an ambiguous time" - but _which_ one is intended? The
> earlier or the later?
>

I had the same objection when Guido first mentioned "fold" as a side
remark in his earlier comments.
I agree that unlike "first", "fold" requires some additional knowledge
to realize that fold=False means the first and fold=True means the
second of the two moments. The diagram is the mnemonic, not a proof
that fold=True means the second:

          fold=True
            +---+
             \  |
              \ |
  fold=False   \|   fold=False
  ------------->+-----------

You can draw a reversed diagram and argue for the opposite meaning:

          fold=True
            +---+
            |  /
            | /
  fold=False|/   fold=False
  --------->+-------------

For some reason, however, the first diagram feels more natural.

Note that when it comes to timekeeping, things are very
culture-dependent. For example, I was told that for Chinese speakers
the future is behind your back because you can see the past, but not
the future. Thus, while it seems natural for us that "first" is
earlier than "second", it may well be the opposite for other people.
Another example that confuses me in English usage is to "advance the
time". Does it mean to move to a later or to an earlier time? And how
does it match the "in advance" idiom?

I think it is actually better to use a less loaded word "fold" and
explain that we consider the earlier of the two ambiguous times to be
"regular" and the later to be "in the fold" than to hope that users
understand what "first" means without explanation.

From ethan at stoneleaf.us Tue Aug 18 19:19:38 2015
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 18 Aug 2015 10:19:38 -0700
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com>
References: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com>
Message-ID: <55D3692A.4030706@stoneleaf.us>

On 08/18/2015 09:43 AM, Felipe Ochoa wrote:

> To me the problem is that in my mind both of the repeated times are
> part of "the fold" so `fold=True` intuitively to me means the same
> as `is_ambiguous=True`.
What if we merge the proposals and set the > attribute to `last_fold`? Then the timeline looks like: > > last_fold=True > +--+ (time flows counter- > \ | clockwise in the > \| loop) > ---------------->+-----------------> > last_fold=False last_fold=False I find 'last_fold' to be more confusing than 'fold'. Plus, time is flowing clockwise. Maybe 'repeat'? Then `repeat = 0` is the default, and `repeat = 1` for the repeated time. -- ~Ethan~ From guido at python.org Tue Aug 18 19:22:26 2015 From: guido at python.org (Guido van Rossum) Date: Tue, 18 Aug 2015 10:22:26 -0700 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D34397.9030502@stoneleaf.us> Message-ID: On Tue, Aug 18, 2015 at 8:38 AM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > On Tue, Aug 18, 2015 at 11:07 AM, Guido van Rossum > wrote: > > PEP 500 allows way more freedom than is needed. > > This is right. With respect to finding elapsed time between two times > (x and y) expressed as datetime instances, PEP-0500 allow tzinfo > implementor to provide an arbitrary function of two variables: d(x, > y), while arguably in most cases d(x, y) = x - f(x) - y + f(y), where > f is a function of a single variable. (In terms of current tzinfo > interface, f is the utcoffset() function.) In other words, the PEP > allows arbitrary stretching and shrinking of the time line, while in > most use cases only cutting and shifting is needed. > > However, I don't see this additional freedom as a big complication. > Even in the common case, it may be easier to implement d(x, y) than to > figure out f(x). The problem with f(x) is that it is the UTC offset > as a function of local time while most TZ database interfaces only > provide UTC offset as a function of UTC time. As a result, it is > often easier to implement d(x, y) (for example, as d(x, y) = g(x) - > g(y)) than to implement f(x). > This discussion sounds overly abstract. 
ISTM that d(x, y) in timeline arithmetic can be computed as x.timestamp() - y.timestamp(), (and converting to a timedelta). Similar for adding a datetime and a timedelta. Optimizing this should be IMO the only question is how should a datetime object choose between classic arithmetic[1] and timeline arithmetic. My proposal here is to make that a boolean property of the tzinfo object -- we could either use a marker subclass or an attribute whose absence implies classic arithmetic.

[1] Classic arithmetic is how datetime arithmetic works today, i.e. if you add 24 hours to 12:00 on the Saturday before a DST transition, the answer will be 12:00 on Sunday (whereas using timeline arithmetic the answer would be 11:00 or 13:00 depending on which DST transition you're talking about).

--
--Guido van Rossum (python.org/~guido)

From alexander.belopolsky at gmail.com Tue Aug 18 19:23:07 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 18 Aug 2015 13:23:07 -0400
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com>
References: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com>
Message-ID:

On Tue, Aug 18, 2015 at 12:43 PM, Felipe Ochoa <felipe.nospam.ochoa at gmail.com> wrote:

> What if we merge the proposals and set the attribute to `last_fold`?

Well, "last fold" sounds like "in the last of two or more folds" rather than "the last in a fold".

Here is another way of thinking about the "fold": when you move your clock back in the Fall, which hour do you think you had and which the government gave you? I think most people would say they had the first and the second they are borrowing until the spring. So, when your government creates a fold in the local time, it gives you the second hour in addition to the first that you've already had.
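
To make Guido's footnote on classic versus timeline arithmetic concrete, here is a minimal sketch. It assumes a hypothetical `ToyEastern` tzinfo that hard-codes only the 2015 US Eastern transitions (it is not a real zone implementation, and it makes no attempt to disambiguate the repeated hour), and it relies on the default `tzinfo.fromutc()` to convert back from UTC.

```python
from datetime import datetime, timedelta, timezone, tzinfo

HOUR = timedelta(hours=1)


class ToyEastern(tzinfo):
    """Hypothetical zone: 2015 US Eastern rules only (UTC-5, DST is UTC-4)."""

    DST_START = datetime(2015, 3, 8, 2)  # local clocks jump 02:00 -> 03:00
    DST_END = datetime(2015, 11, 1, 2)   # local clocks fall 02:00 -> 01:00

    def dst(self, dt):
        naive = dt.replace(tzinfo=None)
        in_dst = self.DST_START <= naive < self.DST_END
        return HOUR if in_dst else timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


tz = ToyEastern()
# Noon on the Saturday before the spring-forward transition.
start = datetime(2015, 3, 7, 12, 0, tzinfo=tz)

# Classic arithmetic: what datetime does today, plain wall-clock addition.
classic = start + timedelta(hours=24)

# Timeline arithmetic: add 24 elapsed hours in UTC, then convert back.
timeline = (start.astimezone(timezone.utc) + timedelta(hours=24)).astimezone(tz)

# d(x, y) computed the way Guido suggests, from timestamps.
elapsed = timedelta(seconds=classic.timestamp() - start.timestamp())

print(classic.strftime("%a %H:%M %Z"))   # Sunday 12:00 wall-clock time
print(timeline.strftime("%a %H:%M %Z"))  # Sunday 13:00 wall-clock time
print(elapsed)                           # only 23 hours really elapsed
```

The wall clock reads 12:00 after the classic addition even though only 23 hours have elapsed, which is exactly the discrepancy the choice between the two kinds of arithmetic has to settle.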
From carl at oddbird.net Tue Aug 18 19:35:06 2015
From: carl at oddbird.net (Carl Meyer)
Date: Tue, 18 Aug 2015 11:35:06 -0600
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To: <55D3692A.4030706@stoneleaf.us>
References: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com> <55D3692A.4030706@stoneleaf.us>
Message-ID: <55D36CCA.5070300@oddbird.net>

On 08/18/2015 11:19 AM, Ethan Furman wrote:

> Maybe 'repeat'? Then `repeat = 0` is the default, and `repeat = 1` for
> the repeated time.

One problem with both `fold` and `repeat` is that this flag (per the PEP) also influences resolution of "missing" times, in which case there is no fold (there's a gap) and no time is repeated.

IMO the strongest candidate might be `later`. It's easy to figure out which time it would mean in all cases, it applies reasonably to both the "fold" and the "gap", and (unlike `first`) it defaults to False.

I don't really see the downside mentioned in the PEP (possible confusion with `latter`) as a strong negative. There are plenty of other APIs in the stdlib which could be mis-spelled; why is that a deal-breaker here?

Carl

From guido at python.org Tue Aug 18 19:36:14 2015
From: guido at python.org (Guido van Rossum)
Date: Tue, 18 Aug 2015 10:36:14 -0700
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To:
References: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com>
Message-ID:

How about this diagram?

    fold=True | fold=False
          +---+----------->
           \  .
            \ .
fold=False   \.
------------->+

That's how I think of what happens when you set the clock an hour back.
On Tue, Aug 18, 2015 at 10:23 AM, Alexander Belopolsky <alexander.belopolsky at gmail.com> wrote:

>
> On Tue, Aug 18, 2015 at 12:43 PM, Felipe Ochoa <
> felipe.nospam.ochoa at gmail.com> wrote:
>
>> What if we merge the proposals and set the attribute to `last_fold`?
>
>
> Well, "last fold" sounds like "in the last of two or more folds" rather
> than "the last in a fold".
>
> Here is another way of thinking about the "fold": when you move your clock
> back in the Fall, which hour do you think you had and which the government
> gave you? I think most people would say they had the first and the second
> they are borrowing until the spring. So, when your government creates a
> fold in the local time, it gives you the second hour in addition to the
> first that you've already had.
>

--
--Guido van Rossum (python.org/~guido)

From alexander.belopolsky at gmail.com Tue Aug 18 20:03:57 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 18 Aug 2015 14:03:57 -0400
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To:
References: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com>
Message-ID:

On Tue, Aug 18, 2015 at 1:36 PM, Guido van Rossum wrote:

> How about this diagram?
>
>     fold=True | fold=False
>           +---+----------->
>            \  .
>             \ .
> fold=False   \.
> ------------->+
>
> That's how I think of what happens when you set the clock an hour back.

This is a traditional way to visualize the end of DST (with the summer time at the bottom and the winter time at the top), but in the context of PEP 495, a slightly different picture is more appropriate.
After addition of the disambiguation flag, the set of local times doubles, so instead of one time line, you have two: fold=True o o o o +---+ o o o o o ... o o o o o o o o o o o . . "Gap" . . <- "Fold" | .. V fold=False ----------->+---------- ... ------> o o o +------ Valid times are represented as dashes ("-") on the diagram above and invalid as circles ("o"). Note that in this picture, the current "fold-unaware" timeline is just the bottom line (fold=False), while your picture suggests that fold=False set is somehow discontinuous at the fall-back point, but it is not in the fold-unaware world, and preserving this was one of my design goals for PEP 495. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Tue Aug 18 20:16:17 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 18 Aug 2015 14:16:17 -0400 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: <55D36CCA.5070300@oddbird.net> References: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com> <55D3692A.4030706@stoneleaf.us> <55D36CCA.5070300@oddbird.net> Message-ID: On Tue, Aug 18, 2015 at 1:35 PM, Carl Meyer wrote: > One problem with both `fold` and `repeat` is that this flag (per the > PEP) also influences resolution of "missing" times, in which case there > is no fold (there's a gap) and no time is repeated. > I thought about this and my answer is that a "gap" is a negative "fold", so in the fold you have t.replace(fold=True) - t.replace(fold=False) > 0 and in the gap - the opposite t.replace(fold=True) - t.replace(fold=False) < 0. While admittedly, this is an a posteriori justification, it makes perfect sense to me. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Tue Aug 18 20:20:28 2015 From: guido at python.org (Guido van Rossum) Date: Tue, 18 Aug 2015 11:20:28 -0700 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com> <55D3692A.4030706@stoneleaf.us> <55D36CCA.5070300@oddbird.net> Message-ID: I find 'later' also acceptable. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at oddbird.net Tue Aug 18 20:22:49 2015 From: carl at oddbird.net (Carl Meyer) Date: Tue, 18 Aug 2015 12:22:49 -0600 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com> <55D3692A.4030706@stoneleaf.us> <55D36CCA.5070300@oddbird.net> Message-ID: <55D377F9.2060604@oddbird.net> On 08/18/2015 12:16 PM, Alexander Belopolsky wrote: > > On Tue, Aug 18, 2015 at 1:35 PM, Carl Meyer > wrote: > > One problem with both `fold` and `repeat` is that this flag (per the > PEP) also influences resolution of "missing" times, in which case there > is no fold (there's a gap) and no time is repeated. > > > I thought about this and my answer is that a "gap" is a negative "fold", > so in the fold you have t.replace(fold=True) - t.replace(fold=False) > 0 > and in the gap - the opposite t.replace(fold=True) - > t.replace(fold=False) < 0. While admittedly, this is an a posteriori > justification, it makes perfect sense to me. That's lovely :-) But I still think I could figure out whether to use later=True or later=False more easily than I could figure out whether to use fold=True or fold=False (in both cases really, but especially in the gap case). Carl -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: OpenPGP digital signature
URL:

From alexander.belopolsky at gmail.com Tue Aug 18 20:41:52 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 18 Aug 2015 14:41:52 -0400
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To: <55D377F9.2060604@oddbird.net>
References: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com> <55D3692A.4030706@stoneleaf.us> <55D36CCA.5070300@oddbird.net> <55D377F9.2060604@oddbird.net>
Message-ID:

On Tue, Aug 18, 2015 at 2:22 PM, Carl Meyer wrote:

> > I thought about this and my answer is that a "gap" is a negative "fold",
> > so in the fold you have t.replace(fold=True) - t.replace(fold=False) > 0
> > and in the gap - the opposite t.replace(fold=True) -
> > t.replace(fold=False) < 0. While admittedly, this is an a posteriori
> > justification, it makes perfect sense to me.
>
> That's lovely :-)
>
> But I still think I could figure out whether to use later=True or
> later=False more easily than I could figure out whether to use fold=True
> or fold=False (in both cases really, but especially in the gap case).

I am not sure Raymond Hettinger is receiving these emails, so I added him to "bcc". It would be good to have input from people with teaching experience.

The problem with "hardcoding" the temporal relationship in the name of the flag is that for a missing time `t` you get a counter-intuitive t.replace(later=True) - t.replace(later=False) < 0. On one hand, this strongly suggests that something is wrong with `t`, but it also invites the question: why not make this an error? The rule that a "gap" is a negative "fold" may not be much better in terms of teachability, but it is hard to judge without actual teaching experience.

PEP link: https://www.python.org/dev/peps/pep-0495

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From tim.peters at gmail.com Tue Aug 18 20:42:42 2015 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 18 Aug 2015 13:42:42 -0500 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com> <55D3692A.4030706@stoneleaf.us> <55D36CCA.5070300@oddbird.net> Message-ID: [Guido] > I find 'later' also acceptable. +1. We went from something ("first") everyone understood at once, to something ("fold") that has so far required 3 competing sets of diagrams and a pile of thought experiments to justify. The latter was not a good sign ;-) But I'd still name it "_later" instead. It's an implementation detail required to make some obscure edge cases work right "by magic" far more than an advertised feature users need to worry about (or even be aware of). tzinfo authors need to be acutely aware of it, but since silence implies assent I take it we all agree now there are only 2 of those ;-) From carl at oddbird.net Tue Aug 18 20:54:11 2015 From: carl at oddbird.net (Carl Meyer) Date: Tue, 18 Aug 2015 12:54:11 -0600 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com> <55D3692A.4030706@stoneleaf.us> <55D36CCA.5070300@oddbird.net> <55D377F9.2060604@oddbird.net> Message-ID: <55D37F53.5090303@oddbird.net> On 08/18/2015 12:41 PM, Alexander Belopolsky wrote: > > On Tue, Aug 18, 2015 at 2:22 PM, Carl Meyer > wrote: [snip] > But I still think I could figure out whether to use later=True or > later=False more easily than I could figure out whether to use fold=True > or fold=False (in both cases really, but especially in the gap case). > > I am not sure Raymod Hettinger is receiving these emails, so I added him > to "bcc". It will be good to have an input from people with teaching > experience. 
The problem with "hardcoding" the temporal relationship in > the name of the flag is that for a missing time `t` you get a > counter-intuitive t.replace(later=True) - t.replace(later=False) < 0. > On one hand, this strongly suggests that something is wrong with `t`, > but also invites a question why not make this an error? A "gap" is a > negative "fold" rule may not be much better in terms of teachability, > but it is hard to judge it without an actual teaching experience. `t.replace(later=True) - t.replace(later=False) < 0` certainly seems wrong, but why would it be implemented that way? I now see that your PEP text specifies the same "backwards from the plain sense of the flag" behavior with `first` as the flag: "The value returned by dt.timestamp() given a missing dt will be the larger of the two "nice to know" values if dt.first == True and the smaller otherwise." Why not simply flip the sense of that sentence so that `later` always means the later of the two possible resolutions? Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From tim.peters at gmail.com Tue Aug 18 20:58:24 2015 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 18 Aug 2015 13:58:24 -0500 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: Message-ID: [Alexander Belopolsky ] > ... > Note that when it comes to timekeeping, things are very culture-dependent. > For example, I was told that for Chinese, future is behind your back because > you can see the past, but not the future. Thus, while it seems natural for > us that "first" is earlier than the "second", it may as well be the opposite > for other people. Then it's good that "later" doesn't suffer this hypothetical problem ;-) In English there's no possible confusion between what "earlier" and "later" mean when applied to time. 
> Another example that gets me confused in English usage is > to "advance the time". Does it mean to move to a later or to the earlier > time? And how does it match the "in advance" idiom? I've always had a similar problem when people talk about a time zone being "ahead of" or "behind" GMT (or, later, UTC). Because different people can and do use each term with both meanings. Indeed, that's why, much as it continues to irk you ;-) , the datetime docs talk about the literally senseless (yet universally understood) "west of" and "east of" UTC. > I think it is actually better to use a less loaded word "fold" and explain > that we consider the earlier of the two ambiguous times to be "regular" and > the later to be "in the fold" than hope that users understand what "first" > means without explanation. Arguing that it's good to use a word nobody understands _because_ nobody understands it requires about 6 more pages of argument than that to succeed ;-) But, seriously, I still think it should be "_later" or "_first" instead: 99.9973% of users will never have a need to understand what it means. The datetime and tzinfo implementers will make it work by magic for them in almost all cases. At worst, the 0.0027% of users remaining may get irked by some anal software asking them "you know, the time you specified is ambiguous in your current time zone - did you intend the earlier or later time?". They'll reboot their computer and hope the question goes away ;-) From guido at python.org Tue Aug 18 21:04:07 2015 From: guido at python.org (Guido van Rossum) Date: Tue, 18 Aug 2015 12:04:07 -0700 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: Message-ID: On Tue, Aug 18, 2015 at 11:58 AM, Tim Peters wrote: > But, seriously, I still think it should be "_later" or "_first" > instead: 99.9973% of users will never have a need to understand what > it means. 
The datetime and tzinfo implementers will make it work by > magic for them in almost all cases. At worst, the 0.0027% of users > remaining may get irked by some anal software asking them "you know, > the time you specified is ambiguous in your current time zone - did > you intend the earlier or later time?". They'll reboot their computer > and hope the question goes away ;-) > That's why I proposed 'fold' -- it tells you "you probably don't care about this" without the leading underscore -- underscore smells too much of "internal to implementation" to me, while we really want this to be part of the API, just an obscure part. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Tue Aug 18 21:06:14 2015 From: tim.peters at gmail.com (Tim Peters) Date: Tue, 18 Aug 2015 14:06:14 -0500 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: <55D37F53.5090303@oddbird.net> References: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com> <55D3692A.4030706@stoneleaf.us> <55D36CCA.5070300@oddbird.net> <55D377F9.2060604@oddbird.net> <55D37F53.5090303@oddbird.net> Message-ID: [Carl Meyer ] > `t.replace(later=True) - t.replace(later=False) < 0` certainly seems > wrong, but why would it be implemented that way? So that from the two values alone it's possible to distinguish among the 3 possibilities: the time is ambiguous, the time is invalid (in a gap), or the time is ordinary - corresponding to which of ">", "<", or "==" obtains. I'd probably be happier requiring, e.g., a new `classify()` tzinfo method to give a direct answer, but that would be a new burden. Tricking other methods into giving another way to distinguish is at worst defensible. 
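
The three-way test Tim describes can be sketched as follows. This assumes the flag under the name it eventually shipped with in Python 3.6 (`fold`), and a hypothetical `ToyEastern` tzinfo that covers only the 2015 US Eastern transitions. The `classify` helper is illustrative only; it is a plain function, not the `classify()` tzinfo method Tim muses about, which is not specified anywhere in this thread.

```python
from datetime import datetime, timedelta, tzinfo

HOUR = timedelta(hours=1)


class ToyEastern(tzinfo):
    """Hypothetical fold-aware zone: 2015 US Eastern rules only."""

    GAP_START = datetime(2015, 3, 8, 2)    # 02:00-02:59 local never happens
    FOLD_START = datetime(2015, 11, 1, 1)  # 01:00-01:59 local happens twice

    def dst(self, dt):
        naive = dt.replace(tzinfo=None, fold=0)
        if self.GAP_START <= naive < self.GAP_START + HOUR:
            # Missing time: fold=0 applies the rules in force before the
            # transition (standard), fold=1 the rules after it (DST).
            return HOUR if dt.fold else timedelta(0)
        if self.FOLD_START <= naive < self.FOLD_START + HOUR:
            # Ambiguous time: fold=0 is the first pass (DST),
            # fold=1 is the second pass (standard).
            return timedelta(0) if dt.fold else HOUR
        in_dst = self.GAP_START + HOUR <= naive < self.FOLD_START
        return HOUR if in_dst else timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


def classify(naive, tz):
    """Compare the two readings of a local time, as described above."""
    t = naive.replace(tzinfo=tz)
    delta = t.replace(fold=1).timestamp() - t.replace(fold=0).timestamp()
    if delta > 0:
        return "ambiguous"  # the fold=1 reading is the later moment
    if delta < 0:
        return "missing"    # in a gap the order of the readings flips
    return "ordinary"


tz = ToyEastern()
print(classify(datetime(2015, 11, 1, 1, 30), tz))  # inside the fold
print(classify(datetime(2015, 3, 8, 2, 30), tz))   # inside the gap
print(classify(datetime(2015, 7, 1, 12, 0), tz))   # neither
```

Note that the comparison has to be made on `timestamp()` values (or on `utcoffset()` results); subtracting the two datetimes directly yields zero, because same-zone subtraction ignores the flag.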
From ethan at stoneleaf.us Tue Aug 18 21:09:32 2015 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 18 Aug 2015 12:09:32 -0700 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com> <55D3692A.4030706@stoneleaf.us> <55D36CCA.5070300@oddbird.net> <55D377F9.2060604@oddbird.net> Message-ID: <55D382EC.8020301@stoneleaf.us> On 08/18/2015 11:41 AM, Alexander Belopolsky wrote: > The problem with "hardcoding" the temporal relationship in the name of > the flag is that for a missing time `t` you get a counter-intuitive > t.replace(later=True) - t.replace(later=False) < 0. I don't understand this. In the PEP it says: > An instance that has first=False in a non-ambiguous case is said to represent > an invalid time (or is invalid for short), but users are not prevented from > creating invalid instances by passing first=False to a constructor or to a > replace() method. and later > The value of "first" will be ignored in all operations except those that > involve conversion between timezones. So why won't `t.replace(_ltdf=True)` be the same value as `t.replace(_ltdf=False)` ? The flag itself would be different, but the flag is not consulted for maths operations, right? 
-- ~Ethan~ From ethan at stoneleaf.us Tue Aug 18 21:22:40 2015 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 18 Aug 2015 12:22:40 -0700 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: <55D382EC.8020301@stoneleaf.us> References: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com> <55D3692A.4030706@stoneleaf.us> <55D36CCA.5070300@oddbird.net> <55D377F9.2060604@oddbird.net> <55D382EC.8020301@stoneleaf.us> Message-ID: <55D38600.70000@stoneleaf.us> On 08/18/2015 12:09 PM, Ethan Furman wrote: > On 08/18/2015 11:41 AM, Alexander Belopolsky wrote: > >> The problem with "hardcoding" the temporal relationship in the name of >> the flag is that for a missing time `t` you get a counter-intuitive >> t.replace(later=True) - t.replace(later=False) < 0. > > I don't understand this. In the PEP it says: > >> An instance that has first=False in a non-ambiguous case is said to represent >> an invalid time (or is invalid for short), but users are not prevented from >> creating invalid instances by passing first=False to a constructor or to a >> replace() method. > > and later > >> The value of "first" will be ignored in all operations except those that >> involve conversion between timezones. > > So why won't `t.replace(_ltdf=True)` be the same value as `t.replace(_ltdf=False)` ? The flag itself would be different, but the flag is not consulted for maths operations, right? Ah, puzzling through the PEP again I think you just left off the `.timestamp()` from those two pieces. 
-- ~Ethan~ From carl at oddbird.net Tue Aug 18 21:24:53 2015 From: carl at oddbird.net (Carl Meyer) Date: Tue, 18 Aug 2015 13:24:53 -0600 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com> <55D3692A.4030706@stoneleaf.us> <55D36CCA.5070300@oddbird.net> <55D377F9.2060604@oddbird.net> <55D37F53.5090303@oddbird.net> Message-ID: <55D38685.5080707@oddbird.net> On 08/18/2015 01:06 PM, Tim Peters wrote: > [Carl Meyer ] >> `t.replace(later=True) - t.replace(later=False) < 0` certainly seems >> wrong, but why would it be implemented that way? > > So that from the two values alone it's possible to distinguish among > the 3 possibilities: the time is ambiguous, the time is invalid (in a > gap), or the time is ordinary - corresponding to which of ">", "<", > or "==" obtains. > > I'd probably be happier requiring, e.g., a new `classify()` tzinfo > method to give a direct answer, but that would be a new burden. > Tricking other methods into giving another way to distinguish is at > worst defensible. Yes, I see. In that case I think `later` and `first` are both poor choices for the name of the flag. A comprehensible name that sometimes means the opposite of its plain English meaning is arguably worse than an incomprehensible name :-) I too wish the PEP offered a nicer way to detect ambiguous or missing times than "guess in both directions and see how they differ" -- but I'm not familiar enough with the core datetime timezone APIs to propose what that should be. (I'm afraid that I've been unrecoverably corrupted by too many years of using pytz's API, which requires that you never touch any of the built-in timezone API.) Carl (Btw, Tim, sorry I never replied to your gracious response to my "essay" [rant] about timeline arithmetic a couple weeks ago; I was traveling and didn't find the time. 
Your response was interesting, and on target primarily in demonstrating how relative "correct" is. Personally I couldn't care less about leap seconds, and my careless misuse of the term "astronomical time" reflected how little I know or care about them, not my interest in them. I still think that ideally the default behavior of datetime arithmetic with tz-aware datetimes should be timeline arithmetic, and naive datetimes should be used when "classic" arithmetic is desired -- but obviously that fails backward-compatibility.) -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From alexander.belopolsky at gmail.com Tue Aug 18 21:31:40 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 18 Aug 2015 15:31:40 -0400 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: <55D382EC.8020301@stoneleaf.us> References: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com> <55D3692A.4030706@stoneleaf.us> <55D36CCA.5070300@oddbird.net> <55D377F9.2060604@oddbird.net> <55D382EC.8020301@stoneleaf.us> Message-ID: On Tue, Aug 18, 2015 at 3:09 PM, Ethan Furman wrote: > > The problem with "hardcoding" the temporal relationship in the name of >> the flag is that for a missing time `t` you get a counter-intuitive >> t.replace(later=True) - t.replace(later=False) < 0. >> > > .. > >> The value of "first" will be ignored in all operations except those that >> involve conversion between timezones. >> > > So why won't `t.replace(_ltdf=True)` be the same value as > `t.replace(_ltdf=False)` ? The flag itself would be different, but the > flag is not consulted for maths operations, right? I used t.replace(later=True) - t.replace(later=False) < 0 as a shorthand for t.replace(later=True).timestamp() - t.replace(later=False).timestamp() < 0. (The 0 in the r.h.s. 
instead of timedelta(0) could serve as a hint.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at oddbird.net Tue Aug 18 21:31:49 2015 From: carl at oddbird.net (Carl Meyer) Date: Tue, 18 Aug 2015 13:31:49 -0600 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com> <55D3692A.4030706@stoneleaf.us> <55D36CCA.5070300@oddbird.net> Message-ID: <55D38825.9030801@oddbird.net> Hi Tim, On 08/18/2015 12:42 PM, Tim Peters wrote: [snip] > But I'd still name it "_later" instead. It's an implementation detail > required to make some obscure edge cases work right "by magic" far > more than an advertised feature users need to worry about (or even be > aware of). tzinfo authors need to be acutely aware of it, but since > silence implies assent I take it we all agree now there are only 2 of > those ;-) I'm not a tzinfo author. I recently wrote a scheduling system (using datetime and pytz), and I made use of pytz's roughly-equivalent `is_dst` flag in order to detect gaps and folds in a weekly calendar displayed to the user in local time, so as to (more or less) gracefully prevent the scheduling of appointments in a nonexistent hour, even though for visual/layout reasons the nonexistent hour still has to appear on the calendar. [1] So I'm not sure it's true that this is a flag that only tzinfo authors will have use for. That experience is also why I wish PEP 495 had a nicer way to check for ambiguous/missing times. That project did a _ton_ of "construct naive datetime by combining date and time, then convert that naive datetime to a local timezone, raising an error if the resulting local time is missing or ambiguous." That was trivial with the pytz API: if you specify `tz.localize(naive_dt, is_dst=None)` you'll get an exception on an ambiguous or missing result. 
It sounds less fun with PEP 495 (would just require a wrapper function to do the conversions, so not too bad -- just feels wrong to do all the conversions twice, when internally datetime should be able to know enough to just raise the exception I want right away). Carl [1] Fun sidenote: Google Calendar, last I checked, completely fails to handle this gracefully. If you try to schedule an event in the nonexistent hour, the UI allows you to proceed, and then you get an opaque "Unknown error" message. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From ethan at stoneleaf.us Tue Aug 18 22:06:33 2015 From: ethan at stoneleaf.us (Ethan Furman) Date: Tue, 18 Aug 2015 13:06:33 -0700 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com> <55D3692A.4030706@stoneleaf.us> <55D36CCA.5070300@oddbird.net> <55D377F9.2060604@oddbird.net> <55D382EC.8020301@stoneleaf.us> Message-ID: <55D39049.5010901@stoneleaf.us> On 08/18/2015 12:31 PM, Alexander Belopolsky wrote: > I used > > t.replace(later=True) - t.replace(later=False) < 0 > > as a shorthand for > > t.replace(later=True).timestamp() - t.replace(later=False).timestamp() < 0 > > (The 0 in the r.h.s. instead of timedelta(0) > could serve as a hint.) The result was invalid and confusing code. Please don't do that. 
--
~Ethan~

From alexander.belopolsky at gmail.com Tue Aug 18 22:09:33 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 18 Aug 2015 16:09:33 -0400
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To: <55D38825.9030801@oddbird.net>
References: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com> <55D3692A.4030706@stoneleaf.us> <55D36CCA.5070300@oddbird.net> <55D38825.9030801@oddbird.net>
Message-ID:

On Tue, Aug 18, 2015 at 3:31 PM, Carl Meyer wrote:

> That project did a _ton_ of "construct naive
> datetime by combining date and time, then convert that naive datetime to
> a local timezone, raising an error if the resulting local time is
> missing or ambiguous." That was trivial with the pytz API: if you
> specify `tz.localize(naive_dt, is_dst=None)` you'll get an exception on
> an ambiguous or missing result. It sounds less fun with PEP 495 (would
> just require a wrapper function to do the conversions, so not too bad --
> just feels wrong to do all the conversions twice, when internally
> datetime should be able to know enough to just raise the exception I
> want right away).

PEP 495 is designed to avoid the exceptions for a reason. When I enter a conference time in my scheduler, all I care is that an event is created, people are invited and I am reminded 15 minutes before it starts. In most cases, I don't care which of the two ambiguous times is picked or even which of the two times around the missing time as long as the scheduler knows how to avoid conflicts and how to display the time to all participants in an unambiguous way. If I schedule an international conference call, I certainly don't want to see: your conference time is ambiguous in Uruguay, please pick another time.

Note that the goal was to make naively written software "just work" and produce consistent and defensible results. Not providing any means to schedule in the second part of the fold is defensible: enjoy free time.
Scheduling a call for 2:30AM when user enters 1:30AM that does not exist is also defensible. (1 hour and 30 minutes after the midnight on a spring-forward day is 2:30AM.) A stack trace in the server log due to an uncaught exception is not a defensible result. -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at oddbird.net Tue Aug 18 22:25:52 2015 From: carl at oddbird.net (Carl Meyer) Date: Tue, 18 Aug 2015 14:25:52 -0600 Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement In-Reply-To: References: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com> <55D3692A.4030706@stoneleaf.us> <55D36CCA.5070300@oddbird.net> <55D38825.9030801@oddbird.net> Message-ID: <55D394D0.9030506@oddbird.net> On 08/18/2015 02:09 PM, Alexander Belopolsky wrote: > > On Tue, Aug 18, 2015 at 3:31 PM, Carl Meyer > wrote: > > That project did a _ton_ of "construct naive > datetime by combining date and time, then convert that naive datetime to > a local timezone, raising an error if the resulting local time is > missing or ambiguous." That was trivial with the pytz API: if you > specify `tz.localize(naive_dt, is_dst=None)` you'll get an exception on > an ambiguous or missing result. It sounds less fun with PEP 495 (would > just require a wrapper function to do the conversions, so not too bad -- > just feels wrong to do all the conversions twice, when internally > datetime should be able to know enough to just raise the exception I > want right away). > > > PEP 495 is designed to avoid the exceptions for a reason. When I enter > a conference time in my scheduler, all I care is that an event is > created, people are invited and I am reminded 15 minutes before it > starts. In most cases, I don't care which of the two ambiguous times is > picked or even which of the two times around the missing time as log as > the scheduler knows how to avoid conflicts and how to display the time > to all participants in an unambiguous way. 
"Most of the time I want..." or "the naive user wants..." arguments like this are useful for picking a default behavior, and I agree with you that "guess one way or the other" is a better _default_ behavior than "raise an exception." But in my case, I knew precisely what UX I wanted to provide in case of ambiguous/missing times (and in the case of missing times, it did not involve guessing in either direction), and I wanted the underlying library to just clearly tell me about missing/ambiguous times so I could handle them in my preferred way. I wish PEP 495 provided a simpler (opt-in) way to do this. If I schedule an > international conference call, I certainly don't want to see: your > conference time is ambiguous in Uruguay, please pick another time. Of course -- I wouldn't ever choose the "raise an exception if ambiguous" option on the conversion from UTC to another timezone, I would just want the flag set correctly on the resulting datetime. The primary place where the option to raise an exception is useful (at least in my case) is in converting from naive to local time, where the resulting local time is nonexistent. Note > that the goal was to make naively written software "just work" and > produce consistent and defensible results. Not providing any means to > schedule in the second part of the fold is defensible: enjoy free time. > Scheduling a call for 2:30AM when user enters 1:30AM that does not exist > is also defensible. (1 hour and 30 minutes after the midnight on a > spring-forward day is 2:30AM.) A stack trace in the server log due to > an uncaught exception is not a defensible result. Sure. All of these are arguments are about the default behavior. They are not arguments against providing some opt-in mechanism to efficiently discover when a conversion from naive time results in a nonexistent localized time. Carl -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: OpenPGP digital signature
URL:

From alexander.belopolsky at gmail.com Tue Aug 18 22:46:24 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 18 Aug 2015 16:46:24 -0400
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To: <55D394D0.9030506@oddbird.net>
References: <1439916186887-2ffe88a2-a6c7a661-6fc81adc@mixmax.com> <55D3692A.4030706@stoneleaf.us> <55D36CCA.5070300@oddbird.net> <55D38825.9030801@oddbird.net> <55D394D0.9030506@oddbird.net>
Message-ID:

On Tue, Aug 18, 2015 at 4:25 PM, Carl Meyer wrote:
>
> But in my case, I knew precisely what UX I wanted to provide in case of
> ambiguous/missing times (and in the case of missing times, it did not
> involve guessing in either direction), and I wanted the underlying
> library to just clearly tell me about missing/ambiguous times so I could
> handle them in my preferred way. I wish PEP 495 provided a simpler
> (opt-in) way to do this.

Implementation of any specific behavior is 8 lines of code:

    def local_to_utc(t):
        t0 = t.replace(first=True).astimezone(timezone.utc)
        t1 = t.replace(first=False).astimezone(timezone.utc)
        if t0 == t1:
            return t0
        if t0 < t1:
            return t0  # or return t1 or raise AmbiguousTimeError
        if t0 > t1:
            raise InvalidTimeError  # or return t0 or return t1

but in these 8 lines of code, one can implement 9 different behaviors,
each suitable for some particular business situation. We cannot know in
advance which of the 9 behaviors the users will want, so instead of
making a choice for them, we give them the tools to implement any of the
9 choices and possibly more.

(The last conditional is redundant, but left for clarity.)
-------------- next part --------------
An HTML attachment was scrubbed...
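The pattern in `local_to_utc` above can be tried out with a small self-contained sketch. It assumes the flag as eventually released in Python 3.6 (`fold`, where `fold=0` corresponds to `first=True` in this thread). `Eastern2015` and `classify` are hypothetical helpers written for this illustration only: the tzinfo hard-codes the 2015 US/Eastern transitions and is not a general timezone implementation.

```python
from datetime import datetime, timedelta, timezone, tzinfo

HOUR = timedelta(hours=1)
ZERO = timedelta(0)


class Eastern2015(tzinfo):
    """Toy US/Eastern rules for 2015 only (hypothetical, for illustration)."""

    GAP_START = datetime(2015, 3, 8, 2)    # wall times [02:00, 03:00) do not exist
    FOLD_START = datetime(2015, 11, 1, 1)  # wall times [01:00, 02:00) occur twice

    def dst(self, dt):
        naive = dt.replace(tzinfo=None)
        if self.GAP_START <= naive < self.GAP_START + HOUR:
            # In the gap: fold=0 uses the offset before the transition
            # (standard time), fold=1 the offset after it (DST).
            return HOUR if dt.fold else ZERO
        if self.FOLD_START <= naive < self.FOLD_START + HOUR:
            # In the fold: fold=0 is the first occurrence (still DST),
            # fold=1 the second one (standard time).
            return ZERO if dt.fold else HOUR
        if self.GAP_START + HOUR <= naive < self.FOLD_START:
            return HOUR
        return ZERO

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


def classify(naive, tz):
    """Return 'normal', 'ambiguous' or 'missing' for a naive wall time."""
    t0 = naive.replace(tzinfo=tz, fold=0).astimezone(timezone.utc)
    t1 = naive.replace(tzinfo=tz, fold=1).astimezone(timezone.utc)
    if t0 == t1:
        return "normal"
    if t0 < t1:
        return "ambiguous"
    return "missing"  # t0 > t1


tz = Eastern2015()
for wall in (datetime(2015, 7, 1, 12), datetime(2015, 11, 1, 1, 30),
             datetime(2015, 3, 8, 2, 30)):
    print(wall, classify(wall, tz))
```

A caller that wants pytz-style exceptions can raise on the "ambiguous" and "missing" results, and one that prefers guessing can return `t0` or `t1` instead, which is the range of behaviors the message describes.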
URL:

From alexander.belopolsky at gmail.com Wed Aug 19 00:22:56 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 18 Aug 2015 18:22:56 -0400
Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ...
In-Reply-To:
References: <55D34397.9030502@stoneleaf.us>
Message-ID:

[Alexander Belopolsky]
> However, I don't see this additional freedom as a big complication.
>> Even in the common case, it may be easier to implement d(x, y) than to
>> figure out f(x). The problem with f(x) is that it is the UTC offset
>> as a function of local time while most TZ database interfaces only
>> provide UTC offset as a function of UTC time. As a result, it is
>> often easier to implement d(x, y) (for example, as d(x, y) = g(x) -
>> g(y)) than to implement f(x).

[Guido van Rossum]
> This discussion sounds overly abstract. ISTM that d(x, y) in timeline
> arithmetic can be computed as x.timestamp() - y.timestamp(), (and
> converting to a timedelta).

It can be, but currently, x.timestamp() is implemented as
(t - datetime(1970, 1, 1, tzinfo=timezone.utc)).total_seconds(), so you
end up defining datetime subtraction in terms of datetime subtraction.

Let's consider a specific example. Suppose I want to implement a very
simple timezone like US/Eastern, where I have some simple rules (with a
few historical variations) that given year, month, day, hour and the
"first" flag will tell me whether DST is in effect. For such timezone, I
can easily write a function in C like this:

    long long hours_between(int year1, int month1, int day1, int hour1, int first1,
                            int year2, int month2, int day2, int hour2, int first2)
    {
        return 24 * (jd(year2, month2, day2) - jd(year1, month1, day1)) +
               hour2 - dst(year2, month2, day2, hour2, first2) -
               hour1 + dst(year1, month1, day1, hour1, first1);
    }

where jd and dst are the Julian day and DST functions, each taking under
30 machine cycles to execute.
With PEP 500 approach, you have a couple of attribute accesses and
unpacking of two datetime buffers between t1 - t2 in Python and the
hours_between function, and then you are a few operations with seconds
and microseconds and one new_delta call away from the result.

[Guido van Rossum]
> Similar for adding a datetime and a timedelta. Optimizing this should be
> IMO the only question is how should a datetime object choose between
> classic arithmetic[1] and timeline arithmetic. My proposal here is to make
> that a boolean property of the tzinfo object -- we could either use a
> marker subclass or an attribute whose absence implies classic arithmetic.

With this proposal, we will need something like this:

    def __sub__(self, other):
        if self.tzinfo is not None and self.tzinfo.strict:
            self_offset = self.utcoffset()
            other_offset = other.utcoffset()
            naive_self = self.replace(tzinfo=None)
            naive_other = other.replace(tzinfo=None)
            return naive_self - self_offset - naive_other + other_offset
        # old logic

So we need to create six intermediate Python objects just to do the
math. On top of that, we need the utcoffset() method which is a pain to
write in C, so we will wrap our optimized dst() function and compute
utcoffset() as dst(t) + timedelta(hours=-5), creating four more
intermediate Python objects. At the end of the day, I will not be
surprised if aware datetime subtraction is 10x slower than naive and
every Python textbook recommends to avoid doing arithmetic with aware
datetime objects.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guido at python.org Wed Aug 19 01:12:36 2015
From: guido at python.org (Guido van Rossum)
Date: Tue, 18 Aug 2015 16:12:36 -0700
Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ...
In-Reply-To: References: <55D34397.9030502@stoneleaf.us> Message-ID: On Tue, Aug 18, 2015 at 3:22 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > [Alexander Belopolsky] > >> However, I don't see this additional freedom as a big complication. >>> Even in the common case, it may be easier to implement d(x, y) than to >>> figure out f(x). The problem with f(x) is that it is the UTC offset >>> as a function of local time while most TZ database interfaces only >>> provide UTC offset as a function of UTC time. As a result, it is >>> often easier to implement d(x, y) (for example, as d(x, y) = g(x) - >>> g(y)) than to implement f(x). >>> >> >> [Guido van Rossum] > >> This discussion sounds overly abstract. ISTM that d(x, y) in timeline >> arithmetic can be computed as x.timestamp() - y.timestamp(), (and >> converting to a timedelta). >> > > It can be, but currently, x.timestamp() is implemented as (t - > datetime(1970, 1, 1, tzinfo=timezone.utc)).total_seconds(), so you end up > defining datetime subtraction in terms of datetime subtraction. > > Let's consider a specific example. Suppose I want to implement a very > simple timezone like US/Eastern, where I have some simple rules (with a few > historical variations) that given year, month, day, hour and the "first" > flag will tell me whether DST is in effect. For such timezone, I can > easily write a function in C like this: > > long long hours_between(int year1, int month1, int day1, int hour1, int > first1, > int year2, int month2, int day2, int hour2, int > first2) > { > return 24 * (jd(year2, month2, day2) - jd(year1, month1, day1)) + > hour2 - dst(year2, month2, day2, hour2, first2) - > hour1 + dst(year1, month1, day1, hour1, first1); > } > > where jd and dst are the Julian day and DST functions, each taking under > 30 machine cycles to execute. 
With PEP 500 approach, you have a couple of > attribute accesses and unpacking of two datetime buffers between t1 - t2 in > Python and the hours_between function, and then you are a few operations > with seconds and microseconds and one new_delta call away from the result. > > [Guido van Rossum] > >> Similar for adding a datetime and a timedelta. Optimizing this should be >> IMO the only question is how should a datetime object choose between >> classic arithmetic[1] and timeline arithmetic. My proposal here is to make >> that a boolean property of the tzinfo object -- we could either use a >> marker subclass or an attribute whose absence implies classic arithmetic. >> > > With this proposal, we will need something like this: > > def __sub__(self, other): > if self.tzinfo is not None and self.tzinfo.strict: > self_offset = self.utcoffset() > other_offset = other.utcoffset() > naive_self = self.replace(tzinfo=None) > naive_other = other.replace(tzinfo=None) > return naive_self - self_offset - naive_other + other_offset > # old logic > > So we need to create six intermediate Python objects just to do the math. > On top of that, we need the utcoffset() method which is a pain to write in > C, so we will wrap our optimized dst() function and compute utcoffset() as > dst(t) + timedelta(hours=-5), creating four more intermediate Python > objects. At the end of the day, I will not be surprised if aware datetime > subtraction is 10x slower than naive and every Python textbook recommends > to avoid doing arithmetic with aware datetime objects. > > I doubt it. Most textbooks aren't that concerned with saving a few cycles. (Do most Python textbooks even discuss the cost of object creation or function calls?) Anyways, wouldn't PEP 500 be even slower? -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alexander.belopolsky at gmail.com Wed Aug 19 01:42:00 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 18 Aug 2015 19:42:00 -0400 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D34397.9030502@stoneleaf.us> Message-ID: On Tue, Aug 18, 2015 at 7:12 PM, Guido van Rossum wrote: > >> [Guido van Rossum] >> >>> Similar for adding a datetime and a timedelta. Optimizing this should >>> be IMO the only question is how should a datetime object choose between >>> classic arithmetic[1] and timeline arithmetic. My proposal here is to make >>> that a boolean property of the tzinfo object -- we could either use a >>> marker subclass or an attribute whose absence implies classic arithmetic. >>> >> >> With this proposal, we will need something like this: >> >> def __sub__(self, other): >> if self.tzinfo is not None and self.tzinfo.strict: >> self_offset = self.utcoffset() >> other_offset = other.utcoffset() >> naive_self = self.replace(tzinfo=None) >> naive_other = other.replace(tzinfo=None) >> return naive_self - self_offset - naive_other + other_offset >> # old logic >> >> So we need to create six intermediate Python objects just to do the >> math. On top of that, we need the utcoffset() method which is a pain to >> write in C, so we will wrap our optimized dst() function and compute >> utcoffset() as dst(t) + timedelta(hours=-5), creating four more >> intermediate Python objects. At the end of the day, I will not be >> surprised if aware datetime subtraction is 10x slower than naive and every >> Python textbook recommends to avoid doing arithmetic with aware datetime >> objects. >> >> > I doubt it. Most textbooks aren't that concerned with saving a few cycles. > (Do most Python textbooks even discuss the cost of object creation or > function calls?) Anyways, wouldn't PEP 500 be even slower? > I don't think so. 
We can implement PEP 500 as a C equivalent of

    def __sub__(self, other):
        try:
            try:
                return self.tzinfo.__datetime_diff__(self, other)
            except TypeError:  # assume other is a timedelta
                return self.tzinfo.__datetime_sub__(self, other)
        except AttributeError:  # assume missing PDDM
            pass
        # old logic

and __datetime_diff__ / __datetime_sub__ may be even faster than a
single timedelta - timedelta operation because no timezone would support
a time span of more than a century or two and can do even
microsecond-precision calculations in machine integers.

Granted, most timezone implementations will just implement .utcoffset()
and inherit the slow __datetime_sub/diff__ implementations from
tzstrict, but users who care about a few simple timezones will have an
option to optimize those.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ethan at stoneleaf.us Wed Aug 19 02:00:25 2015
From: ethan at stoneleaf.us (Ethan Furman)
Date: Tue, 18 Aug 2015 17:00:25 -0700
Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ...
In-Reply-To:
References: <55D34397.9030502@stoneleaf.us>
Message-ID: <55D3C719.1010503@stoneleaf.us>

On 08/18/2015 04:42 PM, Alexander Belopolsky wrote:

> [...] and __datetime_diff__ / __datetime_sub__ may be [...]

Why are there separate methods for subtracting a timedelta vs a
datetime? Seems like one method is sufficient.

-- 
~Ethan~

From alexander.belopolsky at gmail.com Wed Aug 19 02:30:14 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 18 Aug 2015 20:30:14 -0400
Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ...
In-Reply-To: <55D3C719.1010503@stoneleaf.us>
References: <55D34397.9030502@stoneleaf.us> <55D3C719.1010503@stoneleaf.us>
Message-ID:

On Tue, Aug 18, 2015 at 8:00 PM, Ethan Furman wrote:

> On 08/18/2015 04:42 PM, Alexander Belopolsky wrote:
>
> [...] and __datetime_diff__ / __datetime_sub__ may be [...]
>> > > Why are there separate methods for subtracting a timedelta vs a datetime? > Seems like one method is sufficient. > The dispatch based on the type of "other" is already implemented in the datetime, so there is no need to reimplement it in each tzinfo implementation. It is expected that most implementations that override __datetime_add__ will provide __datetime_sub__ which is just __datetime_add__(dt, -delta) or a copy of __datetime_add__ with a few +'s replaced with -'s, so implementing __datetime_sub__ won't be much of an extra burden. Also, while it is not in the PEP yet, I plan to add a C-API specification in which C slots corresponding to __datetime_diff__ and __datetime_sub__ will have different signatures. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Aug 19 02:41:54 2015 From: guido at python.org (Guido van Rossum) Date: Tue, 18 Aug 2015 17:41:54 -0700 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D34397.9030502@stoneleaf.us> Message-ID: Whatever. :-) On Tue, Aug 18, 2015 at 4:42 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > On Tue, Aug 18, 2015 at 7:12 PM, Guido van Rossum > wrote: > >> >>> [Guido van Rossum] >>> >>>> Similar for adding a datetime and a timedelta. Optimizing this should >>>> be IMO the only question is how should a datetime object choose between >>>> classic arithmetic[1] and timeline arithmetic. My proposal here is to make >>>> that a boolean property of the tzinfo object -- we could either use a >>>> marker subclass or an attribute whose absence implies classic arithmetic. 
>>>> >>> >>> With this proposal, we will need something like this: >>> >>> def __sub__(self, other): >>> if self.tzinfo is not None and self.tzinfo.strict: >>> self_offset = self.utcoffset() >>> other_offset = other.utcoffset() >>> naive_self = self.replace(tzinfo=None) >>> naive_other = other.replace(tzinfo=None) >>> return naive_self - self_offset - naive_other + other_offset >>> # old logic >>> >>> So we need to create six intermediate Python objects just to do the >>> math. On top of that, we need the utcoffset() method which is a pain to >>> write in C, so we will wrap our optimized dst() function and compute >>> utcoffset() as dst(t) + timedelta(hours=-5), creating four more >>> intermediate Python objects. At the end of the day, I will not be >>> surprised if aware datetime subtraction is 10x slower than naive and every >>> Python textbook recommends to avoid doing arithmetic with aware datetime >>> objects. >>> >>> >> I doubt it. Most textbooks aren't that concerned with saving a few >> cycles. (Do most Python textbooks even discuss the cost of object creation >> or function calls?) Anyways, wouldn't PEP 500 be even slower? >> > > I don't think so. We can implement PEP 500 as a C equivalent of > > def __sub__(self, other): > try: > try: > return self.tzinfo.__datetime_diff__(self, other) > except TypeError: # assume other is a timedelta > return self.tzinfo.__datetime_sub__(self, other) > except AttributeError: # assume missing PDDM > pass > # old logic > > and __datetime_diff__ / __datetime_sub__ may be even faster than a single > timedelta - timedelta operation because no timezone would support a time > span of more than a century or two and can do even microsecond-precision > calculations in machine integers. > > Granted, most timezone implementations will just implement .utcoffset() > and inherit the slow __datetime_sub/diff__ implementations from tzstrict, > but users who care about a few simple timezones will have an option to > optimize those. 
> -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Wed Aug 19 07:52:39 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 19 Aug 2015 00:52:39 -0500 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D34397.9030502@stoneleaf.us> Message-ID: [Guido] > ... > This discussion sounds overly abstract. ISTM that d(x, y) in timeline > arithmetic can be computed as x.timestamp() - y.timestamp(), (and converting > to a timedelta). As someone else might say, if you want timestamps, use timestamps ;-) I want to discourage people from thinking of it that way, because it only works in a theoretical framework abstracting away how arithmetic actually behaves. Timestamps in Python suck in a world of floating-point pain that I tried hard to keep entirely out of datetime module semantics (although I see float operations have increasingly wormed their way in). Everyone who thinks about it soon realizes that a datetime simply has "too many bits" to represent faithfully as a Python float, and so also as a Python timestamp. But I think few realize this isn't a problem confined to datetimes only our descendants will experience. It can surprise people even today. For example, here on my second try: >>> d = datetime.now() >>> d datetime.datetime(2015, 8, 18, 23, 8, 54, 615774) >>> datetime.fromtimestamp(d.timestamp()) datetime.datetime(2015, 8, 18, 23, 8, 54, 615773) See? We can't even expect to round-trip faithfully with current datetimes. It's not really that there "aren't enough bits" to represent a current datetime value in a C double, it's that the closest binary float approximating the decimal 1439957334.615774 is strictly less than that decimal value. That causes the microsecond portion to get chopped to 615773 on the way back. 
It _could_ be rounded instead, which would make roundtripping work for
some number of years to come (before it routinely failed again), but
rounding would cause other surprises.

Anyway, "the right way" to think about timeline arithmetic is the way
the sample code in PEP 500 spells it: using classic datetime arithmetic
on datetimes in (our POSIX approximation of) UTC, converting to/from
other timezones in the obvious ways. There are no surprises then (not
after PEP 495-compliant tzinfo objects exist), neither in theory nor in
how code actually behaves (leaving aside that the results won't always
match real-life clocks).

If you want to _think_ of that as being equivalent to arithmetic using
theoretical infinitely-precise Python timestamps, that's fine. But it
also means you're over 50 years old and the kids will have a hard time
understanding you ;-)

From 4kir4.1i at gmail.com Wed Aug 19 08:51:12 2015
From: 4kir4.1i at gmail.com (Akira Li)
Date: Wed, 19 Aug 2015 09:51:12 +0300
Subject: [Datetime-SIG] PEP 495 (Local Time Disambiguation) is ready for pronouncement
In-Reply-To: (Tim Peters's message of "Tue, 18 Aug 2015 11:30:25 -0500")
References:
Message-ID: <87614bixen.fsf@gmail.com>

Tim Peters writes:

> Related: would people have been happier with the obvious "first" if
> it had been named "_first" instead? For the most part it's an
> attribute intended to be set "by magic" by various timezone
> operations, not something most users need to know anything about
> (indeed, for most users it would qualify as an "attractive nuisance").

If you need to parse logs that use local time instead of utc
(unfortunately) then you have to use an analog of the "first" flag
*explicitly*.
Assuming the timestamps are monotonic (if expressed as UTC time) [1]
then to convert local time to utc:

    tz = tzlocal.get_localzone() # get local timezone as pytz tzinfo object
    for time_string in local_times_from_the_log:
        naive = datetime.strptime(time_string, "%Y/%m/%d %H:%M:%S") # no timezone
        try:
            local = tz.localize(naive, is_dst=None) # attach timezone info
        except pytz.AmbiguousTimeError:
            # first, try dst flag that is the same as the previous timestamp
            local = tz.localize(naive, is_dst=previous.dst())
            if previous >= local: # wrong order
                local = tz.localize(naive, is_dst=not local.dst()) # flip is_dst
            assert previous < local # timestamps must be increasing
        previous = local
        utc_time = local.astimezone(pytz.utc)

pytz uses *is_dst* name with the same meaning _for ambiguous times_ as
pep-495's *first* flag.

[1]: http://stackoverflow.com/questions/26217427/parsing-of-ordered-timestamps-in-local-time-to-utc-while-observing-daylight-sa

From ethan at stoneleaf.us Wed Aug 19 10:17:50 2015
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 19 Aug 2015 01:17:50 -0700
Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ...
In-Reply-To:
References: <55D34397.9030502@stoneleaf.us>
Message-ID: <55D43BAE.7040703@stoneleaf.us>

On 08/18/2015 10:52 PM, Tim Peters wrote:
> [Guido]
>> ...
>> This discussion sounds overly abstract. ISTM that d(x, y) in timeline
>> arithmetic can be computed as x.timestamp() - y.timestamp(), (and converting
>> to a timedelta).
>
> As someone else might say, if you want timestamps, use timestamps ;-)
>
> I want to discourage people from thinking of it that way, because it
> only works in a theoretical framework abstracting away how arithmetic
> actually behaves. Timestamps in Python suck in a world of
> floating-point pain that I tried hard to keep entirely out of datetime
> module semantics (although I see float operations have increasingly
> wormed their way in).
> > Everyone who thinks about it soon realizes that a datetime simply has > "too many bits" to represent faithfully as a Python float, and so also > as a Python timestamp. But I think few realize this isn't a problem > confined to datetimes only our descendants will experience. It can > surprise people even today. For example, here on my second try: > >>>> d = datetime.now() >>>> d > datetime.datetime(2015, 8, 18, 23, 8, 54, 615774) >>>> datetime.fromtimestamp(d.timestamp()) > datetime.datetime(2015, 8, 18, 23, 8, 54, 615773) This bug was introduced by Victor Stinner in issue 14180, and is being tracked in issue 23517. Versions 3.3, 3.4, and soon 3.5 are affected. Victor is refusing to fix/revert, Alexander has given up arguing with him, and I lack the necessary skills. Perhaps the best path forward is to deprecate `.timestamp()` and friends. -- ~Ethan~ From guido at python.org Wed Aug 19 17:30:18 2015 From: guido at python.org (Guido van Rossum) Date: Wed, 19 Aug 2015 08:30:18 -0700 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D34397.9030502@stoneleaf.us> Message-ID: Yes, that's the right way to define it (and PEPs should primarily concern themselves with crisp definitions). Isn't it so that you could get timeline arithmetic today by giving each datetime object a different tzinfo object? On Tue, Aug 18, 2015 at 10:52 PM, Tim Peters wrote: > [Guido] > > ... > > This discussion sounds overly abstract. ISTM that d(x, y) in timeline > > arithmetic can be computed as x.timestamp() - y.timestamp(), (and > converting > > to a timedelta). > > As someone else might say, if you want timestamps, use timestamps ;-) > > I want to discourage people from thinking of it that way, because it > only works in a theoretical framework abstracting away how arithmetic > actually behaves. 
Timestamps in Python suck in a world of > floating-point pain that I tried hard to keep entirely out of datetime > module semantics (although I see float operations have increasingly > wormed their way in). > > Everyone who thinks about it soon realizes that a datetime simply has > "too many bits" to represent faithfully as a Python float, and so also > as a Python timestamp. But I think few realize this isn't a problem > confined to datetimes only our descendants will experience. It can > surprise people even today. For example, here on my second try: > > >>> d = datetime.now() > >>> d > datetime.datetime(2015, 8, 18, 23, 8, 54, 615774) > >>> datetime.fromtimestamp(d.timestamp()) > datetime.datetime(2015, 8, 18, 23, 8, 54, 615773) > > See? We can't even expect to round-trip faithfully with current > datetimes. It's not really that there "aren't enough bits" to > represent a current datetime value in a C double, it's that the > closest binary float approximating the decimal 1439957334.615774 is > strictly less than that decimal value. That causes the microsecond > portion to get chopped to 615773 on the way back. It _could_ be > rounded instead, which would make roundtripping work for some number > of years to come (before it routinely failed again), but rounding > would cause other surprises. > > Anyway, "the right way" to think about timeline arithmetic is the way > the sample code in PEP 500 spells it:: using classic datetime > arithmetic on datetimes in (our POSIX approximation of) UTC, > converting to/from other timezones in the obvious ways There are no > surprises then (not after PEP 495-compliant tzinfo objects exist), > neither in theory nor in how code actually behaves (leaving aside that > the results won't always match real-life clocks). > > If you want to _think_ of that as being equivalent to arithmetic using > theoretical infinitely-precise Python timestamps, that's fine. 
But it > also means you're over 50 years old and the kids will have a hard time > understanding you ;-) > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Wed Aug 19 17:31:37 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 19 Aug 2015 10:31:37 -0500 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: <55D43BAE.7040703@stoneleaf.us> References: <55D34397.9030502@stoneleaf.us> <55D43BAE.7040703@stoneleaf.us> Message-ID: [Tim] >> ... >> >>> d = datetime.now() >> >>> d >> datetime.datetime(2015, 8, 18, 23, 8, 54, 615774) >> >>> datetime.fromtimestamp(d.timestamp()) >> datetime.datetime(2015, 8, 18, 23, 8, 54, 615773) [Ethan Furman] > This bug was introduced by Victor Stinner in issue 14180, and is being > tracked in issue 23517. Thanks! I haven't paid much attention for years, and was blissfully unaware of this. Victor's "round by truncation" patch is clearly the cause of this specific roundtrip failure, but because of the fundamental "not enough bits" reality there is no possible way of converting to a C double that won't fail to convert back faithfully _eventually_ (for then-current datetimes obtained in the future, or for future datetimes typed in today). If datetime is extended to support nanoseconds, replace "the future" with "the past" (faithful roundtripping of nanoseconds would require another (approximately) 20 bits in the C double, but microsecond datetimes from 2015 are already near the edge of what can be done with the 53 bits a C double has - even with "better" rounding). > Versions 3.3, 3.4, and soon 3.5 are affected. > Victor is refusing to fix/revert, Alexander has given up arguing with him, > and I lack the necessary skills. > > Perhaps the best path forward is to deprecate `.timestamp()` and friends. 
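The "not enough bits" point above can be checked with exact integer arithmetic. The following is a minimal sketch, not part of the datetime API: `to_micros` and `from_micros` are hypothetical helpers, and the instant is the one from the quoted session expressed in UTC, assuming the session ran at UTC-5.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
ONE_US = timedelta(microseconds=1)


def to_micros(dt):
    """Exact integer microseconds since the epoch (no float involved)."""
    return (dt - EPOCH) // ONE_US


def from_micros(us):
    """Inverse of to_micros; also exact."""
    return EPOCH + us * ONE_US


# 23:08:54.615774 at UTC-5 is 04:08:54.615774 UTC on the next day.
d = datetime(2015, 8, 19, 4, 8, 54, 615774, tzinfo=timezone.utc)
us = to_micros(d)

exact = Decimal(us) / 1_000_000       # the true decimal value in seconds
as_float = Decimal(us / 1_000_000)    # exact value of the nearest C double

print(us.bit_length())                # bits needed at microsecond resolution
print((us * 1000).bit_length())       # bits needed at nanosecond resolution
print(as_float < exact)               # the double falls just below the decimal
print(from_micros(us) == d)           # the integer round trip is lossless
```

Keeping the value as an integer count of microseconds sidesteps the rounding question entirely, at the cost of not being a POSIX-style float timestamp.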
There's "a reason" they weren't in the original design ;-)

I see there's already tons of discussion on the referenced (& related)
bug reports, so if I have a bright idea I'll put it on those.

From carl at oddbird.net Wed Aug 19 17:38:53 2015
From: carl at oddbird.net (Carl Meyer)
Date: Wed, 19 Aug 2015 09:38:53 -0600
Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ...
In-Reply-To:
References: <55D34397.9030502@stoneleaf.us>
Message-ID: <55D4A30D.30909@oddbird.net>

On 08/19/2015 09:30 AM, Guido van Rossum wrote:
> Yes, that's the right way to define it (and PEPs should primarily
> concern themselves with crisp definitions).
>
> Isn't it so that you could get timeline arithmetic today by giving each
> datetime object a different tzinfo object?

Yes. In fact, this is exactly how pytz works around it. With pytz, every
tzinfo _instance_ is a fixed-offset instance (e.g. there are distinct
tzinfo instances for EST and EDT). This means if you do arithmetic using
an EST datetime and an EDT datetime, you'll get (timeline-arithmetic)
correct results. The downside of this approach (though once you get used
to it it's at least predictable and explicit) is that if you do
arithmetic using e.g. an EST datetime and a timedelta, you'll always end
up with an EST result, even if the result crossed a transition to EDT
(so it's sort of an "imaginary extension of EST"), and you have to
remember to use a pytz-specific "normalize()" API, which will adjust the
datetime by an hour and switch the tzinfo from EST to EDT.

Carl

From guido at python.org Wed Aug 19 17:55:13 2015
From: guido at python.org (Guido van Rossum)
Date: Wed, 19 Aug 2015 08:55:13 -0700
Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ...
In-Reply-To: <55D4A30D.30909@oddbird.net> References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net> Message-ID: On Wed, Aug 19, 2015 at 8:38 AM, Carl Meyer wrote: > On 08/19/2015 09:30 AM, Guido van Rossum wrote: > > Yes, that's the right way to define it (and PEPs should primarily > > concern themselves with crisp definitions). > > > > Isn't it so that you could get timeline arithmetic today by giving each > > datetime object a different tzinfo object? > > Yes. In fact, this is exactly how pytz works around it. With pytz, every > tzinfo _instance_ is a fixed-offset instance (e.g. there are distinct > tzinfo instances for EST and EDT). This means if you do arithmetic using > an EST datetime and an EDT datetime, you'll get (timeline-arithmetic) > correct results. The downside of this approach (though once you get used > to it it's at least predictable and explicit) is that if you do > arithmetic using e.g. an EST datetime and a timedelta, you'll always end > up with an EST result, even if the result crossed a transition to EDT > (so it's sort of an "imaginary extension of EST"), and you have to > remember to use a pytz-specific "normalize()" API, which will adjust the > datetime by an hour and switch the tzinfo from EST to EDT. > That does not sound like what I was proposing (as a hack) -- it simply exchanges one bug for another. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Wed Aug 19 17:55:55 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 19 Aug 2015 10:55:55 -0500 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D34397.9030502@stoneleaf.us> Message-ID: [Guido ] > Yes, that's the right way to define it (and PEPs should primarily concern > themselves with crisp definitions). 
>
> Isn't it so that you could get timeline arithmetic today by giving each
> datetime object a different tzinfo object?

For datetime - datetime subtraction, yes, and that's always been so.
The Python code for that is mildly instructive here, since it doesn't
actually convert anything to a UTC timezone, nor does it do any
arithmetic directly _on_ a datetime:

    def __sub__(self, other):
        ...
        days1 = self.toordinal()
        days2 = other.toordinal()
        secs1 = self._second + self._minute * 60 + self._hour * 3600
        secs2 = other._second + other._minute * 60 + other._hour * 3600
        base = timedelta(days1 - days2,
                         secs1 - secs2,
                         self._microsecond - other._microsecond)
        if self._tzinfo is other._tzinfo:
            return base
        myoff = self.utcoffset()
        otoff = other.utcoffset()
        if myoff == otoff:
            return base
        if myoff is None or otoff is None:
            raise TypeError("cannot mix naive and timezone-aware time")
        return base + otoff - myoff

It's "for speed" that it uses timedelta arithmetic, bashing the
datetimes into timedelta's days+seconds+microseconds format first
(which is equivalent to using "bigint" timestamps counting
microseconds since 1 Jan 1!). Then instead of converting zones, it
just adjusts the result (when appropriate) by the difference between
the inputs' UTC offsets.

So the kinds of things PEP 500 shows are, appropriately, the simplest
ways of spelling out the intents. The implementations may look very
different - provided they behave the same way.

From alexander.belopolsky at gmail.com Wed Aug 19 18:15:55 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 19 Aug 2015 12:15:55 -0400
Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ...
In-Reply-To:
References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net>
Message-ID:

On Wed, Aug 19, 2015 at 11:55 AM, Guido van Rossum wrote:
>>
>> > Isn't it so that you could get timeline arithmetic today by giving each
>> > datetime object a different tzinfo object?
>>
>> Yes.
>> In fact, this is exactly how pytz works around it. With pytz, every
>> tzinfo _instance_ is a fixed-offset instance (e.g. there are distinct
>> tzinfo instances for EST and EDT). This means if you do arithmetic using
>> an EST datetime and an EDT datetime, you'll get (timeline-arithmetic)
>> correct results. The downside of this approach (though once you get used
>> to it it's at least predictable and explicit) is that if you do
>> arithmetic using e.g. an EST datetime and a timedelta, you'll always end
>> up with an EST result, even if the result crossed a transition to EDT
>> (so it's sort of an "imaginary extension of EST"), and you have to
>> remember to use a pytz-specific "normalize()" API, which will adjust the
>> datetime by an hour and switch the tzinfo from EST to EDT.
>
>
> That does not sound like what I was proposing (as a hack) -- it simply
> exchanges one bug for another.

No, it is not a bug. Just a slight inconvenience for the programmer.
Here is how you do strict timeline arithmetic in local time using
Python 3.3+:

>>> t = datetime(2015, 3, 7, 17, tzinfo=timezone.utc)
>>> lt = t.astimezone()
>>> print(lt)
2015-03-07 12:00:00-05:00

Good - we've got an aware local time.

>>> lt += timedelta(1)
>>> print(lt)
2015-03-08 12:00:00-05:00

This is not a wrong answer. "2015-03-08 12:00:00-05:00" is indeed 24
hours after "2015-03-07 12:00:00-05:00". The only problem is that
people in the US/Eastern timezone don't spell it that way: they already
use summer time (-04:00) on 2015-03-08. However, fixing this is one
method call away:

>>> print(lt.astimezone())
2015-03-08 13:00:00-04:00

All the "timeline arithmetic" gets you is the ability to write correct
code without that one extra call.

From chris.barker at noaa.gov Wed Aug 19 18:21:29 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Wed, 19 Aug 2015 09:21:29 -0700
Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ...
In-Reply-To:
References: <55D34397.9030502@stoneleaf.us>
Message-ID:

On Tue, Aug 18, 2015 at 4:12 PM, Guido van Rossum wrote:

>> At the end of the day, I will not be surprised if aware datetime
>> subtraction is 10x slower than naive and every Python textbook recommends
>> to avoid doing arithmetic with aware datetime objects.
>>
>>
> I doubt it. Most textbooks aren't that concerned with saving a few cycles.
> (Do most Python textbooks even discuss the cost of object creation or
> function calls?) Anyways, wouldn't PEP 500 be even slower?
>

and the folks that DO care about performance will probably want to use
numpy's datetime64 anyway :-)

-Chris

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From alexander.belopolsky at gmail.com Wed Aug 19 18:31:23 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 19 Aug 2015 12:31:23 -0400
Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ...
In-Reply-To:
References: <55D34397.9030502@stoneleaf.us>
Message-ID:

On Wed, Aug 19, 2015 at 12:21 PM, Chris Barker wrote:

> and the folks that DO care about performance will probably want to use
> numpy's datetime64 anyway :-)

Yes, numpy's datetime64: "get wrong answers 1000x faster!"

From ethan at stoneleaf.us Wed Aug 19 18:47:38 2015
From: ethan at stoneleaf.us (Ethan Furman)
Date: Wed, 19 Aug 2015 09:47:38 -0700
Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ...
In-Reply-To: References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net> Message-ID: <55D4B32A.4020506@stoneleaf.us> On 08/19/2015 09:15 AM, Alexander Belopolsky wrote: > On Wed, Aug 19, 2015 at 11:55 AM, Guido van Rossum wrote: >> That does not sound like what I was proposing (as a hack) -- it simply >> exchanges one bug for another. > > No, it is not a bug. Po-tay-to, po-tah-to. ;) From my point of view, a time represented by the opposite DST setting is invalid -- and I guarantee it will trip somebody up: most folks don't expect a local time to be in the opposite DST setting (and yes, an aware time is still local to the people in that zone). -- ~Ethan~ From guido at python.org Wed Aug 19 18:53:49 2015 From: guido at python.org (Guido van Rossum) Date: Wed, 19 Aug 2015 09:53:49 -0700 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net> Message-ID: On Wed, Aug 19, 2015 at 9:15 AM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > On Wed, Aug 19, 2015 at 11:55 AM, Guido van Rossum > wrote: > >>[...] > > That does not sound like what I was proposing (as a hack) -- it simply > > exchanges one bug for another. > > No, it is not a bug. > That's a ridiculously narrow point of view. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Wed Aug 19 19:22:31 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 19 Aug 2015 13:22:31 -0400 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... 
In-Reply-To:
References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net>
Message-ID:

On Wed, Aug 19, 2015 at 12:53 PM, Guido van Rossum wrote:

> On Wed, Aug 19, 2015 at 9:15 AM, Alexander Belopolsky <
> alexander.belopolsky at gmail.com> wrote:
>
>> On Wed, Aug 19, 2015 at 11:55 AM, Guido van Rossum
>> wrote:
>> >>[...]
>> > That does not sound like what I was proposing (as a hack) -- it simply
>> > exchanges one bug for another.
>>
>> No, it is not a bug.
>>
>
> That's a ridiculously narrow point of view.
>
>
Well,

>>> print(lt)
2015-03-07 12:00:00-05:00
>>> print(lt + timedelta(1))
2015-03-08 12:00:00-05:00

Is as much a bug as

>>> sum([0.1] * 10) == 1
False

The code works exactly as designed and documented. The fact that it
can be used to write buggy programs does not mean that there is a bug
in the library. The same library can be used to write correct programs
and I have shown how.

I don't see what is so "narrow" in my point of view. Can you elaborate?

Regardless of the software involved, if I give you an ISO 8601 string
"2015-03-07 12:00:00-0500" and ask you: What time will be 24 hours
after that? I bet your answer will be "2015-03-08 12:00:00-0500." And I
don't think you will appreciate being ridiculed for not knowing that
the "correct" answer is "2015-03-08 13:00:00-0400."

After all, Python does not really care:

>>> fmt = "%Y-%m-%d %H:%M:%S%z"
>>> datetime.strptime("2015-03-08 13:00:00-0400", fmt) == datetime.strptime("2015-03-08 12:00:00-0500", fmt)
True

From chris.barker at noaa.gov Wed Aug 19 19:23:11 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Wed, 19 Aug 2015 10:23:11 -0700
Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ...
In-Reply-To:
References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net>
Message-ID:

IIUC, PEP 500 essentially says that anything datetime does can be
delegated to a tzinfo object. Which reduces the datetime object to a
simple container of a datetime stamp:

years, months, days, hours, min, sec, microsec.

As the current implementation has a way to add on a tzinfo object, it is
a way forward to make them all-powerful, to add arbitrary functionality
without changing the way existing code works -- but is it the only or
best way?

I think it would be very helpful (maybe only to me) to spell out the
goals of PEP 500, so we can determine if they are goals we want to
support, and if PEP 500 is the best way to support them:

I see a number of different goals, all crammed in:

- Support different Calendars: anything other than Proleptic Gregorian
  Calendar -- maybe include Lunar Calendars, etc???

- Support leap-seconds - this strikes me as essentially a slightly
  different Calendar -- aside from having to use an updated database,
  it's very similar to leap years.

- Support different time arithmetic -- "Duration" and "Period"
  arithmetic -- or "strict", or ????

I think it's a fine idea to open the door to support all these, but is
PEP 500 the way to do it?

Anything else?

We now have a few particular objects to work with:

time
date
datetime
timedelta
tzinfo

These each have their own purpose and protocol -- can we leverage that
to support the above?

*tzinfo* essentially provides the offset to/from UTC time for a given
timezone at a given datetime. PEP 495 adds a feature that completes the
ability to fully support this functionality -- why not keep that clean?

*timedelta* is essentially a way to encode a time duration --
microseconds. Useful and simple.

*datetime* encodes a timestamp, and provides an implementation of the
proleptic Gregorian calendar, so that it can convert differences between
datetimes to "real" timespans -- i.e. timedeltas.
Thus it can support subtracting datetimes, and adding timedeltas to a
datetime.

OK -- so given all this (and Tim, please correct me where I have it
wrong -- I probably do), how best to support the goals above? (and, of
course, not break any existing code)

**Arithmetic:**

We've identified two "kinds" of arithmetic: "Duration" (actual seconds)
and "Period" (where timestamps are important, i.e., "the next day, same
time", etc.). Currently strict arithmetic is not supported by "aware"
datetime objects, but neither is much in the way of Period Arithmetic.

- *Period arithmetic*: As I understand it, this is pretty well supported
  by dateutil right now -- are the dateutil maintainers asking for
  anything to make this better / easier??? Also, Period arithmetic
  requires all sorts of things other than simple addition and
  subtraction -- "Next Tuesday", "next business day", who knows what? So
  it seems overloading __add__ and __sub__ doesn't really buy much
  anyway.

- *Duration arithmetic*: I think this is the most useful thing to add --
  currently, if you have tz-aware datetimes, you have to convert both to
  UTC, do the math, and convert back to the timezone you want. This
  isn't too painful, and is considered best practice by some folks
  anyway (actually, best practice is to convert to UTC on I/O, and
  always use UTC internally). But despite best practices, sometimes
  someone simply wants to do it in the time zone they are in. And I
  suspect there is code out there that does a simple subtraction, and it
  works fine if they haven't crossed a DST border, so they haven't found
  the bug.

-- so how to add Duration arithmetic? Since this is currently handled by
datetime, that seems like the obvious place to put it. Either with a
subclass, or, my preference, with an attribute that tells you what kind
of arithmetic you want, which would, of course, default to the current
behavior.
The trick here is that if one were to subtract two datetimes with the
flag set differently, you'd have to decide which to respect -- but we
could document which takes precedence. And this is the same problem as
when you have two datetimes with different tzinfo implementations.

- There was talk of having multiple kinds of time deltas, which might
  represent either Durations or Periods, but as timedelta only supports
  Periods that map precisely and unambiguously to a particular Duration,
  that would be a much bigger API change to do anything useful. And it
  probably wouldn't be the way to go anyway, as mapping all the kinds of
  Period arithmetic you want to binary operations isn't practical.

- Also -- one could make a new datetime object that did the same thing
  as the current one, but stored the timestamp as a
  "time_span_since_an_epoch", to get better performance for Duration
  arithmetic, while sacrificing performance for pulling out the
  human-readable representation. I don't know that anyone would do that,
  but it would be a way to go, and I don't think it would be doable by
  delegating to the tzinfo object.

**Different Calendars**

So how to handle different Calendars? -- again, the Calendar
implementation is in datetime now, so subclassing datetime makes the
most sense to me. It could be subclassed to support leap seconds, for
instance, and all the rest of the machinery would work fine: timedeltas,
time, tzinfo objects.

Also, if you want to get really far out, then lunar calendars, etc,
aren't suited to the year, month, day system currently used by datetime,
so you'd have to re-implement that anyway -- it couldn't be crammed into
a tzinfo object -- at least without a lot of pain.
And implementing a new Calendar with a new duck-typed datetime object
would require no changes to the std lib -- so nothing to argue about
here :-)

So all this reduces to one maybe-proposal for the stdlib: add a flag to
datetime specifying whether you want "Duration" Arithmetic, rather than
the current "naive" arithmetic.

I know on this list at some point someone suggested that the "strict"
flag go in the tzinfo object, but I can't see why it should be there,
other than that we're messing with that object in PEP 495 already.

So, in short:

I don't think the PEP 500 "delegate everything to tzinfo objects"
approach is the way to go. Python already has subclassing, when you want
different behaviour of an object. But in any case, if everyone else
thinks it's the way to go, then it needs an explanation for why it's
better than putting the new functionality in datetime subclasses, or
duck-typed classes. Or, for that matter, what I think Guido is
suggesting -- a totally different datetime module.

-Chris

From guido at python.org Wed Aug 19 19:25:43 2015
From: guido at python.org (Guido van Rossum)
Date: Wed, 19 Aug 2015 10:25:43 -0700
Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ...
In-Reply-To:
References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net>
Message-ID:

Rather than prolonging the debate, let me just reject PEP 500. If you
want to use datetime objects just as containers, you can implement a
bunch of functions that manipulate them the way you want.
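
A minimal sketch of what such functions might look like, assuming the recipe discussed earlier in the thread (convert to UTC, use classic arithmetic, convert back). The names timeline_add, timeline_sub and Eastern2015 are hypothetical and made up for this illustration; Eastern2015 is a toy tzinfo that hard-codes the 2015 US/Eastern DST rules and is not a substitute for a real time zone database.

```python
from datetime import datetime, timedelta, timezone, tzinfo

HOUR = timedelta(hours=1)


class Eastern2015(tzinfo):
    """Toy US/Eastern rules for 2015 only (hypothetical, for illustration)."""

    DST_START = datetime(2015, 3, 8, 2)   # local wall-clock time
    DST_END = datetime(2015, 11, 1, 2)    # local wall-clock time

    def dst(self, dt):
        wall = dt.replace(tzinfo=None)
        return HOUR if self.DST_START <= wall < self.DST_END else timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


def timeline_add(dt, delta):
    """Add delta as elapsed time: go to UTC, add, convert back."""
    return (dt.astimezone(timezone.utc) + delta).astimezone(dt.tzinfo)


def timeline_sub(a, b):
    """Elapsed time between two aware datetimes, even if they share a tzinfo."""
    return a.astimezone(timezone.utc) - b.astimezone(timezone.utc)


eastern = Eastern2015()
start = datetime(2015, 3, 7, 12, tzinfo=eastern)   # the day before DST starts
classic = start + timedelta(days=1)                # same wall-clock time next day
strict = timeline_add(start, timedelta(days=1))    # 24 elapsed hours later
print(classic, strict)
print(classic - start, timeline_sub(classic, start))
```

With these toy rules, classic addition keeps the wall clock at 12:00 on March 8, which is only 23 elapsed hours after the start, while timeline_add lands on 13:00 with a -04:00 offset. The datetime objects are used purely as containers; all the timeline logic lives in the two helper functions.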
--
--Guido van Rossum (python.org/~guido)

From chris.barker at noaa.gov Wed Aug 19 19:27:34 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Wed, 19 Aug 2015 10:27:34 -0700
Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ...
In-Reply-To: References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net> Message-ID: On Wed, Aug 19, 2015 at 10:22 AM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > Regardless of the software involved, if I give you an ISO 8601 string > "2015-03-07 12:00:00-0500" and ask you: What time will be 24 hours after > that? I bet your answer will be "2015-03-08 12:00:00-0500." And I don't > think you will appreciate being ridiculed for not knowing that the > "correct" answer is "2015-03-08 13:00:00-0400." > But that isn't, and can't be the "correct" answer. ISO 8601 does not have a way to encode timezones. So that string can only mean a particular time in UTC. Unless there is a specification of the timezone somewhere else. But this is about how to interpret ISO 8601 strings, so really a different question anyway. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Aug 19 19:28:38 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 19 Aug 2015 10:28:38 -0700 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net> Message-ID: On Wed, Aug 19, 2015 at 10:25 AM, Guido van Rossum wrote: > Rather than prolonging the debate, let me just reject PEP 500. If you want > to use datetime objects just as containers, you can implement a bunch of > functions that manipulate them the way you want. > Ahh, and right after I wrote that big long message. ;-) -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Wed Aug 19 19:51:35 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 19 Aug 2015 13:51:35 -0400 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net> Message-ID: On Wed, Aug 19, 2015 at 1:23 PM, Chris Barker wrote: > IIUC, PEP 500 essentially says essentially anything that datetime does can > be delegated to a tzinfo object. Which reduces the datetime object to a > simple container of a datetime stamp: > > years, months, days, hours, min, sec, microsec. > > No, it is more than that: you forgot the tzinfo. :-) However without tzinfo, it is a container plus the naive arithmetic, plus simple formatting and parsing, plus conversions to and from various other forms - timetuple, timestamp, etc. The 5000+ lines of the datetime library code are not going anywhere with PEP 500. > As the current implementation has a way to add on tzinfo object, it is a > way forward to make them all-powerful, to add arbitrary functionality > without changing the way an existing code will work -- but it it the only > or best way? > IMO, this is the most straightforward way. The tzinfo object implementation is where you ultimately have the information you need to implement the timeline arithmetics. To do that in datetime object, you need a way to communicate the information from the tzinfo object to datetime. 
The current interface that consists of .utcoffset() .dst() and optionally .fromutc() methods becomes your bottleneck because a well-written __datetime_sub__ method can get you the result faster than the two calls to .utcoffset(). Look at the typical implementation of the C mktime() function: < https://github.com/lattera/glibc/blob/master/time/mktime.c#L343>. The reason it is so ridiculously complicated is that it has a needle hole view of the timezone data. You have a similar situation when your needle hole is the tzinfo interface. You have to forgo all the simplifications that are possible when you know the specifics of your timezone. Even if your timezone is identical to UTC, you still end up adding and subtracting timedelta(0) on each arithmetic operation. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Wed Aug 19 22:13:08 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 19 Aug 2015 16:13:08 -0400 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net> Message-ID: On Wed, Aug 19, 2015 at 1:27 PM, Chris Barker wrote: > ISO 8601 does not have a way to encode timezones. I don't have the official $300+ PDF of the standard, but wikipedia disagrees with you: < https://en.wikipedia.org/wiki/ISO_8601#Time_zone_designators>. -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at oddbird.net Wed Aug 19 22:30:07 2015 From: carl at oddbird.net (Carl Meyer) Date: Wed, 19 Aug 2015 14:30:07 -0600 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... 
In-Reply-To: References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net> Message-ID: <55D4E74F.7050501@oddbird.net> On 08/19/2015 02:13 PM, Alexander Belopolsky wrote: > > On Wed, Aug 19, 2015 at 1:27 PM, Chris Barker > wrote: > > ISO 8601 does not have a way to encode timezones. > > > I don't have the official $300+ PDF of the standard, but wikipedia > disagrees with you: > . I think Chris is distinguishing between "timezones" and "UTC offsets". ISO 8601 can certainly encode a UTC offset, but it can't encode a timezone in the sense of "America/New_York". Which is why it's not incorrect at all to say that "2015-03-07 12:00:00-0500" plus 24 hours is "2015-03-08 12:00:00-0500." To claim it should instead be "2015-03-08 13:00:00-0400" is making an unjustified assumption that "UTC-0500" must only mean "America/New_York, not during DST." On the other hand, it's at least arguably incorrect to say that "12:00 March 7 2015 EST" plus 24 hours is "12:00 March 8 2015 EST" rather than "13:00 March 8 2015 EDT". When you're working in terms of a timezone rather than just a UTC offset, there's a reasonable expectation to handle that timezone's DST transitions. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From chris.barker at noaa.gov Wed Aug 19 22:54:19 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 19 Aug 2015 13:54:19 -0700 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: <55D4E74F.7050501@oddbird.net> References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net> <55D4E74F.7050501@oddbird.net> Message-ID: On Wed, Aug 19, 2015 at 1:30 PM, Carl Meyer wrote: > > I don't have the official $300+ PDF of the standard neither do I :-( > , but wikipedia > > disagrees with you: > > . 
> > I think Chris is distinguishing between "timezones" and "UTC offsets". > ISO 8601 can certainly encode a UTC offset, but it can't encode a > timezone in the sense of "America/New_York". > Exactly. However, the Wikipedia page does say: "Time zones in ISO 8601 are represented as local time (with the location unspecified), as UTC, or as an offset from UTC." but nothing about how one specifies the location -- I have no idea if there is an ISO 8601 way to specify a location, but I've never seen it -- Wikipedia may mean that it should be specified some other way than embedded in the string. But it's not reliable to reverse-engineer an offset to find the timezone. Though I don't think this really has anything to do with the topic at hand -- is anyone proposing that parsing ISO strings _should_ assume a time zone? BTW, an ISO string without an offset (or z, or...) is often (officially) called "local time". So numpy's datetime takes that to mean the timezone of the machine's locale settings -- which is a really ugly mess, so let's not do that. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From alexander.belopolsky at gmail.com Wed Aug 19 23:24:41 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 19 Aug 2015 17:24:41 -0400 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ...
In-Reply-To: <55D4E74F.7050501@oddbird.net> References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net> <55D4E74F.7050501@oddbird.net> Message-ID: On Wed, Aug 19, 2015 at 4:30 PM, Carl Meyer wrote: > > On 08/19/2015 02:13 PM, Alexander Belopolsky wrote: > > > > On Wed, Aug 19, 2015 at 1:27 PM, Chris Barker > > wrote: > > > > ISO 8601 does not have a way to encode timezones. > > > > > > I don't have the official $300+ PDF of the standard, but wikipedia > > disagrees with you: > > . > > I think Chris is distinguishing between "timezones" and "UTC offsets". > ISO 8601 can certainly encode a UTC offset, but it can't encode a > timezone in the sense of "America/New_York". I did understand that, of course, but general claims such as "ISO 8601 does not have a way to encode timezones" misrepresent the actual state of affairs. Similarly, claims such as "Python datetime module does not support timeline calculations with aware datetime instances" are wrong. People who make them are likely unaware of the features added to the datetime module in the 3.x series. > > Which is why it's not incorrect at all to say that "2015-03-07 > 12:00:00-0500" plus 24 hours is "2015-03-08 12:00:00-0500." To claim it > should instead be "2015-03-08 13:00:00-0400" is making an unjustified > assumption that "UTC-0500" must only mean "America/New_York, not during > DST." Actually, neither answer is wrong because "2015-03-08 12:00:00-0500" and "2015-03-08 13:00:00-0400" is the same time and most computers (and majority of humans) understand that. Arguing over which answer is more correct is like arguing whether 1 + 1 is 2 or 0b10. > > > On the other hand, it's at least arguably incorrect to say that "12:00 > March 7 2015 EST" plus 24 hours is "12:00 March 8 2015 EST" rather than > "13:00 March 8 2015 EDT". When you're working in terms of a timezone > rather than just a UTC offset, there's a reasonable expectation to > handle that timezone's DST transitions. 
Here is how IETF Network Working Group defines a "Time Zone": 3.1. Time Zone A description of the past and predicted future timekeeping practices of a collection of clocks that are intended to agree. Note that the term "time zone" does not have the common meaning of a region of the world at a specific UTC offset, possibly modified by daylight saving time. For example, the "Central European Time" zone can correspond to several time zones "Europe/Berlin", "Europe/Paris", etc., because subregions have kept time differently in the past. By this definition, EST, EDT, MSK, EEST are all "time zones" with a particularly simple "description of timekeeping practices." For example EST can be described as keeping time in UTC minus 5 hours. Most computers implement such "time zones" - just set the environment variable TZ to EST and enjoy Eastern Standard Time all year round. The POSIX standard allows you to describe more complicated timezones TZ=EST+05EDT,M3.2.0,M11.1.0 more or less corresponds to the current US/Eastern DST rules. You can use that in your lab and all your clocks will agree with each other, but possibly not with the world outside. Note that the detailed descriptions of historical timekeeping practices such as Olson's America/New_York zone file typically refer to the simple "time zones": $ zdump -v America/New_York | grep 2015 America/New_York Sun Mar 8 06:59:59 2015 UTC = Sun Mar 8 01:59:59 2015 EST isdst=0 America/New_York Sun Mar 8 07:00:00 2015 UTC = Sun Mar 8 03:00:00 2015 EDT isdst=1 America/New_York Sun Nov 1 05:59:59 2015 UTC = Sun Nov 1 01:59:59 2015 EDT isdst=1 America/New_York Sun Nov 1 06:00:00 2015 UTC = Sun Nov 1 01:00:00 2015 EST isdst=0 which means: in 2015 use EST until March 8, the use EDT until November 1 and then go back to EST. This is just a complicated set of rules broken into a series of simple "subtract X hours from UTC" rules. Python's datetime module contains a fairly complete support for "simple" time zones. 
This support is provided by the datetime.timezone class. If there are cases where datetime.timezone class does not work as documented, I would like to hear about those. You can argue that datetime.timezone is not a "time zone", but rather an "offset", or you can argue that "EST" is not a time zone, but an "abbreviation" (whatever that means), but the fact that IETF Network Working Group definition of time zone and Python's tzinfo class consider simple time zones and geographical time zones to be instances of the same concept suggests that this is the right approach. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Wed Aug 19 23:47:28 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 19 Aug 2015 17:47:28 -0400 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net> <55D4E74F.7050501@oddbird.net> Message-ID: On Wed, Aug 19, 2015 at 4:54 PM, Chris Barker wrote: > > However, the Wikipedia page does say: > > "Time zones in ISO 8601 are represented as local time (with the location unspecified), as UTC, or as an offset from UTC." > > but nothing about how one specifies the location -- I have no idea if there is an ISO 8601 way to specify a location, but I've never seen it -- wikipedia may mean that it should be specified some other way than embedded in the string. Correct, but specifying the location has nothing to do with the computer notion of a time zone. The IETF experts working on the Time Zone Data Distribution standard explain that right in the introduction: Note that the term "time zone" does not have the common meaning of a region of the world at a specific UTC offset, possibly modified by daylight saving time. (I quoted this part already in the message that crossed with yours.) 
See < http://datatracker.ietf.org/doc/draft-ietf-tzdist-service> for details. > though I don't think this really has anything to do with the topic at ahnd -- is anyone proposing that parsing ISO strings _should_ assume a time zone? No, but only because this is already part of the standard library: >>> from datetime import datetime >>> datetime.strptime("2015-03-08 12:00:00-0500", "%Y-%m-%d %H:%M:%S%z") datetime.datetime(2015, 3, 8, 12, 0, tzinfo=datetime.timezone(datetime.timedelta(-1, 68400))) -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Aug 19 23:59:38 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Wed, 19 Aug 2015 14:59:38 -0700 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net> <55D4E74F.7050501@oddbird.net> Message-ID: On Wed, Aug 19, 2015 at 2:47 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > Note that the term "time zone" does not have the common meaning of a > region of the world at a specific UTC offset, possibly modified by > daylight saving time. > sigh -- tangled up in semantics again. we really need a glossary! No, but only because this is already part of the standard library: > > >>> from datetime import datetime > >>> datetime.strptime("2015-03-08 12:00:00-0500", "%Y-%m-%d %H:%M:%S%z") > datetime.datetime(2015, 3, 8, 12, 0, > tzinfo=datetime.timezone(datetime.timedelta(-1, 68400))) > semantics again -- this does indeed create an "aware" timezone, one with a tzinfo object with a fixed offset -- which is what ISO 8601 says it means. But anyway, I think we've only gotten tangled up in semantics here -- I haven't seen anything proposed about this that I have a problem with. -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From alexander.belopolsky at gmail.com Thu Aug 20 00:30:33 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 19 Aug 2015 18:30:33 -0400 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net> <55D4E74F.7050501@oddbird.net> Message-ID: On Wed, Aug 19, 2015 at 5:59 PM, Chris Barker wrote: > >> >>> from datetime import datetime >> >>> datetime.strptime("2015-03-08 12:00:00-0500", "%Y-%m-%d %H:%M:%S%z") >> datetime.datetime(2015, 3, 8, 12, 0, >> tzinfo=datetime.timezone(datetime.timedelta(-1, 68400))) >> > > semantics again -- this does indeed create an "aware" timezone, one with a > tzinfo object with a fixed offset -- which is what ISO 8601 says it means. > > But anyway, I think we've only gotten tangled up in semantics here -- I > haven't seen anything proposed about this that I have a problem with. > Good, so you don't have a problem understanding what "2015-03-08 12:00:00-0500" is. Even though New Yorkers switched to summer time in the early morning of March 8, right? Want to know what time it was in New York?
(Assuming you are not in New York): >>> os.environ['TZ'] = 'America/New_York' >>> datetime.strptime("2015-03-08 12:00:00-0500", "%Y-%m-%d %H:%M:%S%z").astimezone().isoformat() '2015-03-08T13:00:00-04:00' Want to know the same time in Sydney, Australia - be my guest: >>> os.environ['TZ'] = 'Australia/Sydney' >>> datetime.strptime("2015-03-08 12:00:00-0500", "%Y-%m-%d %H:%M:%S%z").astimezone().isoformat() '2015-03-09T04:00:00+11:00' If we all understand that '2015-03-08T12:00:00-05:00', '2015-03-08T13:00:00-04:00' and '2015-03-09T04:00:00+11:00' are different spellings of the same time, where is a bug in the following calculation? >>> print(lt) 2015-03-07 12:00:00-05:00 >>> lt += timedelta(1) >>> print(lt) 2015-03-08 12:00:00-05:00 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Thu Aug 20 01:08:52 2015 From: ethan at stoneleaf.us (Ethan Furman) Date: Wed, 19 Aug 2015 16:08:52 -0700 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D4A30D.30909@oddbird.net> <55D4E74F.7050501@oddbird.net> Message-ID: <55D50C84.1080005@stoneleaf.us> On 08/19/2015 03:30 PM, Alexander Belopolsky wrote: > If we all understand that '2015-03-08T12:00:00-05:00', '2015-03-08T13:00:00-04:00' > and '2015-03-09T04:00:00+11:00' are different spellings of the same time, where is > a bug in the following calculation? > > >>> print(lt) > 2015-03-07 12:00:00-05:00 > >>> lt += timedelta(1) > >>> print(lt) > 2015-03-08 12:00:00-05:00 Well, let's say I live in New York, so all winter long I've been seeing things like "2015-01-17 9:37:51-05:00", etc. and then the time switches in March and I fail to notice that the "-05:00" is still "-05:00" and not "-04:00" -- especially since my watch, clock, smart phone, etc., don't display the offset -- well, something bad will happen: exactly what depends on what the user was expecting when adding a day. 
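The fixed-offset calculation being argued over can be reproduced with datetime.timezone alone. What follows is a minimal sketch, not code from the thread: the names est and edt are hypothetical labels for fixed UTC-5 and UTC-4 offsets, and neither object knows anything about New York's DST rules. It shows that adding a day keeps the -05:00 offset, and that the result is the same instant as the 13:00 -04:00 reading a New Yorker's clock would show.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical fixed-offset zones; they carry no DST rules at all.
est = timezone(timedelta(hours=-5), "EST")
edt = timezone(timedelta(hours=-4), "EDT")

lt = datetime(2015, 3, 7, 12, 0, tzinfo=est)
lt += timedelta(days=1)          # classic arithmetic: offset stays -05:00
print(lt)

# Re-express the same instant with a -04:00 offset.
same_instant = lt.astimezone(edt)
print(same_instant)

# Aware datetimes compare by the instant they denote, not by spelling.
print(lt == same_instant)
```

Whether the -05:00 spelling is a problem then depends only on how the value is displayed, which is the disagreement in this part of the thread.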
-- ~Ethan~ From alexander.belopolsky at gmail.com Thu Aug 20 01:21:04 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 19 Aug 2015 19:21:04 -0400 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net> Message-ID: On Wed, Aug 19, 2015 at 1:25 PM, Guido van Rossum wrote: > Rather than prolonging the debate, let me just reject PEP 500. That is certainly within your powers, but before you do, let me substantiate my claim about 10x performance penalty. $ python3 -m timeit -s "from datetime import datetime, timezone; t = datetime.now(timezone.utc); l = t.astimezone()" "l - t" 1000000 loops, best of 3: 0.448 usec per loop $ python3 -m timeit -s "from datetime import datetime, timezone; t = datetime.now(); l = t" "l - t" 10000000 loops, best of 3: 0.0444 usec per loop Currently, subtracting aware datetimes with the same tzinfo incurs no penalty $ python3 -m timeit -s "from datetime import datetime, timezone; t = datetime.now(timezone.utc); l = t" "l - t" 10000000 loops, best of 3: 0.0436 usec per loop With PEP 500, we can have timeline arithmetic without the overhead of two round trips to tzinfo. I will be happy if someone would demonstrate how to achieve the same by simpler means. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Thu Aug 20 01:32:57 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 19 Aug 2015 19:32:57 -0400 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... 
In-Reply-To: <55D50C84.1080005@stoneleaf.us> References: <55D4A30D.30909@oddbird.net> <55D4E74F.7050501@oddbird.net> <55D50C84.1080005@stoneleaf.us> Message-ID: On Wed, Aug 19, 2015 at 7:08 PM, Ethan Furman wrote: > On 08/19/2015 03:30 PM, Alexander Belopolsky wrote: > > If we all understand that '2015-03-08T12:00:00-05:00', >> '2015-03-08T13:00:00-04:00' >> and '2015-03-09T04:00:00+11:00' are different spellings of the same >> time, where is >> a bug in the following calculation? >> >> >>> print(lt) >> 2015-03-07 12:00:00-05:00 >> >>> lt += timedelta(1) >> >>> print(lt) >> 2015-03-08 12:00:00-05:00 >> > > Well, let's say I live in New York, so all winter long I've been seeing > things like "2015-01-17 9:37:51-05:00", etc. and then the time switches in > March and I fail to notice that the "-05:00" is still "-05:00" and not > "-04:00" -- especially since my watch, clock, smart phone, etc., don't > display the offset -- well, something bad will happen: exactly what depends > on what the user was expecting when adding a day. In this case, add .astimezone() call before printing the time in your program. What can I say, if you want to see the result of 1 + 1 in binary -- call bin() -- Python's "+" won't do it for you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Aug 20 02:21:12 2015 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 19 Aug 2015 17:21:12 -0700 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net> <55D4E74F.7050501@oddbird.net> Message-ID: <-6785188876274532450@unknownmsgid> where is a bug in the following calculation? >>> print(lt) 2015-03-07 12:00:00-05:00 >>> lt += timedelta(1) >>> print(lt) 2015-03-08 12:00:00-05:00 um, nowhere? -Chris -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tim.peters at gmail.com Thu Aug 20 02:16:00 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 19 Aug 2015 19:16:00 -0500 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D4A30D.30909@oddbird.net> <55D4E74F.7050501@oddbird.net> <55D50C84.1080005@stoneleaf.us> Message-ID: I propose to end the argument about what "time zone" means by insisting it means whatever I think it means, and that's the end of it ;-) Specifically, a time zone is a function mapping UTC calendar notation to another (possibly identical) calendar notation. For some purposes, the destination's idea of "calendar notation" includes strings like "America/Chicago" or "CDT"/"CST", and in others it may only include a notion of a fixed UTC offset (like "-05:00"). All are valid "time zones", and, e.g., someone insisting that their idea of calendar notation must magically include a string mnemonically distinguishing daylight from standard time is on ground just as solid as anyone else. Telling them they "shouldn't" insist on that won't work. Showing them it's easily obtained by other means also won't work. Unless, of course, their idea of "time zone" is "how civil time actually works" ;-) From alexander.belopolsky at gmail.com Thu Aug 20 02:34:26 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 19 Aug 2015 20:34:26 -0400 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: <-6785188876274532450@unknownmsgid> References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net> <55D4E74F.7050501@oddbird.net> <-6785188876274532450@unknownmsgid> Message-ID: On Wed, Aug 19, 2015 at 8:21 PM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: > where is a bug in the following calculation? 
> > > >>> print(lt) > 2015-03-07 12:00:00-05:00 > >>> lt += timedelta(1) > >>> print(lt) > 2015-03-08 12:00:00-05:00 > > > um, nowhere? > https://mail.python.org/pipermail/datetime-sig/2015-August/000351.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Aug 20 02:34:32 2015 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 19 Aug 2015 17:34:32 -0700 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: <55D50C84.1080005@stoneleaf.us> References: <55D4A30D.30909@oddbird.net> <55D4E74F.7050501@oddbird.net> <55D50C84.1080005@stoneleaf.us> Message-ID: <-5361061231930960305@unknownmsgid> where is >> a bug in the following calculation? >> >> >>> print(lt) >> 2015-03-07 12:00:00-05:00 >> >>> lt += timedelta(1) >> >>> print(lt) >> 2015-03-08 12:00:00-05:00 > > Well, let's say I live in New York, so all winter long I've been seeing things like "2015-01-17 9:37:51-05:00", etc. and then the time switches in March and I fail to notice that the "-05:00" is still "-05:00" and not "-04:00" -- especially since my watch, clock, smart phone, etc., don't display the offset -- well, something bad will happen: If you expect to see local time, you probably aren't looking at ISO timestamps. > exactly what depends on what the user was expecting when adding a day. Not really -- that distinction is about Duration vs Period arithmetic. This is about the fact that a tzinfo object that represents an offset simply has no idea about DST transitions -- it CAN'T do the period arithmetic some people want. In your use case -- don't use an offset. Period, end of story. If you want Period arithmetic, use full featured tzinfo objects. ( and dateutils ). And don't expect to be able to round-trip via an ISO 8601 string. 
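The point about offsets and round-tripping can be checked directly. This is a small sketch under the assumption that the timestamp is parsed with strptime and "%z", as in the earlier example in this thread; the names FMT and stamp are illustrative only. It shows that the parsed tzinfo is a fixed-offset datetime.timezone, that it reports no DST information, and that a day later the string still carries the same offset.

```python
from datetime import datetime, timedelta, timezone

FMT = "%Y-%m-%dT%H:%M:%S%z"          # illustrative format string
stamp = "2015-03-07T12:00:00-0500"   # illustrative ISO-style timestamp

parsed = datetime.strptime(stamp, FMT)

# The tzinfo recovered from the string is a plain fixed offset.
is_fixed_offset = isinstance(parsed.tzinfo, timezone)

# A fixed offset has no DST knowledge: dst() returns None.
dst_info = parsed.dst()

# Adding a day cannot change the offset, whatever the local DST rules are.
next_day = parsed + timedelta(days=1)
round_tripped = next_day.strftime(FMT)

print(is_fixed_offset, dst_info, round_tripped)
```

Nothing in the string says which named zone produced the offset, so that information cannot be recovered on parsing.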
-Chris From tim.peters at gmail.com Thu Aug 20 02:46:00 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 19 Aug 2015 19:46:00 -0500 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net> Message-ID: [Alexander Belopolsky] > /// let me substantiate my claim about 10x performance penalty. > > $ python3 -m timeit -s "from datetime import datetime, timezone; t = datetime.now(timezone.utc); l = t.astimezone()" "l - t" > 1000000 loops, best of 3: 0.448 usec per loop > $ python3 -m timeit -s "from datetime import datetime, timezone; t = datetime.now(); l = t" "l - t" > 10000000 loops, best of 3: 0.0444 usec per loop > > Currently, subtracting aware datetimes with the same tzinfo incurs no > penalty > > $ python3 -m timeit -s "from datetime import datetime, timezone; t = datetime.now(timezone.utc); l = t" "l - t" > 10000000 loops, best of 3: 0.0436 usec per loop > > With PEP 500, we can have timeline arithmetic without the overhead of two > round trips to tzinfo. I will be happy if someone would demonstrate how to > achieve the same by simpler means. In general, I agree timeline arithmetic "belongs in" tzinfo objects. As you've said elsewhere, the tzinfo interface gives general datetime code a microscopically tiny understanding of how a given timezone works; in effect, it can only ask what the total UTC offset and DST offset are at a single microsecond in local time, or what a single microsecond in UTC time looks like in local time. Code outside the tzinfo can't _deduce_ anything more generally applicable from any of that; code inside the tzinfo knows everything about how the timezone works. But in the specific case above, I'd call the specter of slowdowns a QOI (quality of implementation) issue. 
For eternally-fixed-offset-and-eternally-fixed-timezone-name timezones (including UTC), classic and timeline arithmetic are exactly the same thing. A high-quality wrapping of timezone info should take advantage of that, by _not_ using whatever spelling of "this tzinfo object wants timeline arithmetic" is introduced for such timezones. In which case, the naive arithmetic shortcuts will continue to be used for such timezones. I expect this will be crucial for timezone.utc (should be a requirement), so that code implementing fancier stuff can _rely_ on the fact that doing arithmetic on UTC instances won't fall into the kinds of infinite regress Lennart got buried under, and without needing endless annoying and time-consuming dances to strip and re-attach the tzinfo object (to force classic arithmetic). From alexander.belopolsky at gmail.com Thu Aug 20 03:00:50 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 19 Aug 2015 21:00:50 -0400 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D4A30D.30909@oddbird.net> <55D4E74F.7050501@oddbird.net> <55D50C84.1080005@stoneleaf.us> Message-ID: On Wed, Aug 19, 2015 at 8:16 PM, Tim Peters wrote: > I propose to end the argument about what "time zone" means by > insisting it means whatever I think it means, and that's the end of it > ;-) > +1 > > Specifically, a time zone is a function mapping UTC calendar notation > to another (possibly identical) calendar notation. > I would not go as far as allowing, say, a function that swaps minutes and seconds in the "time zone" definition, but as long as we agree that datetime.timezone is, umm, a "time zone", I am happy with your definition. > > For some purposes, the destination's idea of "calendar notation" > includes strings like "America/Chicago" or "CDT"/"CST", and in others > it may only include a notion of a fixed UTC offset (like "-05:00").
> I don't think anyone here denies that there is more than fixed UTC offset timezones. It just so happened that this is all that CPython ships in the standard library. > > All are valid "time zones", and, e.g., someone insisting that their > idea of calendar notation must magically include a string mnemonically > distinguishing daylight from standard time is on ground just as solid > as anyone else. > Absolutely right. > > Telling them they "shouldn't" insist on that won't work. Showing them > it's easily obtained by other means also won't work. > > I am more optimistic than you are on that. If "include a string mnemonically distinguishing daylight from standard time" is an actual requirement for the product that they need to ship, I am sure they will appreciate seeing that this can be achieved by adding .astimezone() in a few strategic places. > Unless, of course, their idea of "time zone" is "how civil time > actually works" ;-) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Thu Aug 20 03:18:03 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 19 Aug 2015 20:18:03 -0500 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D4A30D.30909@oddbird.net> <55D4E74F.7050501@oddbird.net> <55D50C84.1080005@stoneleaf.us> Message-ID: ... [TIm] >> Specifically, a time zone is a function mapping UTC calendar notation >> to another (possibly identical) calendar notation. [Alexander] > I would not got a far as allowing say a function that swaps minutes and > seconds in "time zone" definition, I go that far, but I make no promise about how far I'm willing to go to _support_ insane calendar notations ;-) > but as long as we agree that datetime.timezone is, umm, a "time zone", > I am happy with your definition. Universal bliss :-) >> ... 
>> All are valid "time zones", and, e.g., someone insisting that their >> idea of calendar notation must magically include a string mnemonically >> distinguishing daylight from standard time is on ground just as solid >> as anyone else. > Absolutely right. >> Telling them they "shouldn't" insist on that won't work. Showing them >> it's easily obtained by other means also won't work. > I am more optimistic than you are on that. Which I'll take as proof that I'm older than you ;-) > If "include a string mnemonically distinguishing daylight from standard > time" is an actual requirement for the product that they need to ship, > I am sure they will appreciate seeing that this can be achieved by > adding .astimezone() in a few strategic places. For Pythons already released, yes, they'll appreciate that - although most people with such a requirement in a serious application is probably already using pytz, where they've already added pytz's "force the magical standard/daylight string switch" dance. I was really talking about a future post-PEP-495 world: nobody is going to be willing to do _any_ dance to get the strings right after "timeline arithmetic" is added. Except for me. Because I won't use timeline arithmetic - I prefer simple named functions for that purpose. They'll work by magic too. From alexander.belopolsky at gmail.com Thu Aug 20 03:59:25 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 19 Aug 2015 21:59:25 -0400 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... 
In-Reply-To: References: <55D4A30D.30909@oddbird.net> <55D4E74F.7050501@oddbird.net> <55D50C84.1080005@stoneleaf.us> Message-ID: On Wed, Aug 19, 2015 at 9:18 PM, Tim Peters wrote: > > If "include a string mnemonically distinguishing daylight from standard > > time" is an actual requirement for the product that they need to ship, > > I am sure they will appreciate seeing that this can be achieved by > > adding .astimezone() in a few strategic places. > > For Pythons already released, yes, they'll appreciate that - although > most people with such a requirement in a serious application is > probably already using pytz, where they've already added pytz's "force > the magical standard/daylight string switch" dance. Maybe or maybe not. Shipping pytz with your product means becoming a TZ database distributor for your customers. More likely they have an in-house equivalent of .astimezone() on top of time.localtime() or a direct call to system localtime if their code predates us exposing tm_zone and friends in time.localtime(). In this case, they can switch to the now available .astimezone() and have fewer lines of proprietary code to support. -------------- next part -------------- An HTML attachment was scrubbed... URL: From lrekucki at gmail.com Thu Aug 20 07:33:34 2015 From: lrekucki at gmail.com (=?UTF-8?Q?=C5=81ukasz_Rekucki?=) Date: Thu, 20 Aug 2015 07:33:34 +0200 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net> <55D4E74F.7050501@oddbird.net> Message-ID: On Thursday, August 20, 2015, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > On Wed, Aug 19, 2015 at 4:54 PM, Chris Barker > wrote: > > > > However, the Wikipedia page does say: > > > > "Time zones in ISO 8601 are represented as local time (with the location > unspecified), as UTC, or as an offset from UTC." 
> > > > but nothing about how one specifies the location -- I have no idea if > there is an ISO 8601 way to specify a location, but I've never seen it -- > wikipedia may mean that it should be specified some other way than embedded > in the string. > > Correct, but specifying the location has nothing to do with the computer > notion of a time zone. The IETF experts working on the Time Zone Data > Distribution standard explain that right in the introduction: > > Note that the term "time zone" does not have the common meaning of a > region of the world at a specific UTC offset, possibly modified by > daylight saving time. > > You have read this totally backwards. The quote says that the time zone can NOT be seen as a simple UTC offset +/-DST (which is a common misconception that can be seen on many "timezone" maps). As an example CET is given - it's not a timezone as it consists of "Europe/Warsaw", "Europe/Berlin", etc. From a datetime implementation perspective Tim's definition is the best, but from the perspective of implementing a timezone database, a thing like CET or -0500 is just ambiguous because it means one of many actual timezones. -- Łukasz Rekucki From alexander.belopolsky at gmail.com Thu Aug 20 16:53:17 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 20 Aug 2015 10:53:17 -0400 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net> <55D4E74F.7050501@oddbird.net> Message-ID: On Thu, Aug 20, 2015 at 1:33 AM, Łukasz Rekucki wrote: > > You have read this totally backwards. The quote says that the time zone can NOT be seen as a simple UTC offset +/-DST (which is a common misconception that can be seen on many "timezone" maps).
As an example CET is given - it's not a timezone as it consists of "Europe/Warsaw", "Europe/Berlin", etc. By your logic, UTC is not a time zone either because it "consists of" Europe/London, Africa/Casablanca etc. This is a very unorthodox point of view. > > From a datetime implementation perspective Tim's definition is the best, but from the perspective of implementing a timezone database a thing like CET or -0500 is just ambiguous because it means one of many actual timezones. Whatever your definition of "actual timezone" is, if you want to use Python effectively you need to understand the terms as they are used in the library manual. And there we have, for example: "%Z - Time zone name (empty string if the object is naive) - (empty), UTC, EST, CST", [1] "%Z - Time zone name (no characters if no time zone exists), " [2] "tm_zone - abbreviation of timezone name," [3] "time.tzname - A tuple of two strings: the first is the name of the local non-DST timezone, the second is the name of the local DST timezone." [4] and so on. You can insist that CET is not a timezone as much as that 0 is not a number, but as far as Python timekeeping goes, timezone is whatever Tim says it is. (Those who disagree can think of Python timezones as TIMezones. :-) [1]: https://docs.python.org/3/library/datetime.html#strftime-strptime-behavior [2]: https://docs.python.org/3/library/time.html#time.strftime [3]: https://docs.python.org/3/library/time.html#time.struct_time [4]: https://docs.python.org/3/library/time.html#time.tzname
From ischwabacher at wisc.edu Thu Aug 20 22:38:53 2015 From: ischwabacher at wisc.edu (ISAAC J SCHWABACHER) Date: Thu, 20 Aug 2015 20:38:53 +0000 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ...
In-Reply-To: References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net> <55D4E74F.7050501@oddbird.net> Message-ID: The problem with abbreviations like "CET" isn't that they don't represent time zones; the problem is that they're not unique. CST is probably Central Standard Time... unless it's China Standard Time or Cuba Standard Time. There are plenty of other examples. We can't stop people from using these, and we shouldn't refuse to support them out of spite, but if we present to users one obvious way to do it that forces us to guess in the face of this ambiguity, I'd argue that that's bad design. ijs Top-posted from Microsoft Outlook Web App; may its designers be consigned for eternity to that circle of hell in which their dog food is consumed.
From stuart at stuartbishop.net Fri Aug 21 14:07:58 2015 From: stuart at stuartbishop.net (Stuart Bishop) Date: Fri, 21 Aug 2015 19:07:58 +0700 Subject: [Datetime-SIG] PEP-431/495 Message-ID: Sorry I'm late. pytz author here. Gosh you guys write a lot. I've tried to skim things, and will default to agreeing with Tim since it is usually the smart thing to do. A few notes from my skimming: - I want a boolean added to datetime instances, even if I don't like the name, because I can then deprecate pytz and its confusing API and implementation. I'm happy to work on Python implementation and documentation. It will save me time and effort in the long run. - Most of my thoughts got encoded in PEP-431. This would give us a datetime module that operates exactly the way it does today, but with the option of performing pytz style unambiguous datetime arithmetic without pytz and its confusing API.
If the developer explicitly set the is_dst flag, then exceptions would be raised when trying to instantiate an ambiguous or invalid timestamp. For code that does not specify the new, optional flag things work as they do today and a best guess is made when the localized datetime is constructed.
- PEP-495 seems similar to PEP-431, except that it attempts to allow things to continue in the face of an ambiguous or invalid localized datetime. The boolean flag is not tristate, so there is no way to have strict checking of input. It doesn't matter if the developer said 'whatever' and left the flag on the default, or cared enough to explicitly override it.
- The rules in PEP-495 for utcoffset() and dst() to deal with ambiguous times only work in simple cases, as there are dst offsets both more and less than 1 hour, and there is no stdoffset since the offset can change at the same time (eg. Europe/Vilnius 1941, where the clocks ended up going backwards for summer time instead of forwards).
- Other APIs I know of, including Python's time module, use is_dst or isdst as the required boolean flag. As do the timezone databases containing the data we need. I think the argument against the is_dst flag name in PEP-495 is flaccid.
- If there is an argument in favour of 'first' over 'is_dst', it is because occasionally there are timezone changes without a dst transition. If we call it is_dst, we agree that in a few rare historical cases we are going to have to lie.
- My argument in favour of 'is_dst' over 'first' is that this is what we have in the data we are trying to load. You commonly have a timestamp with a timezone abbreviation and/or offset. This can easily be converted to an is_dst flag. To convert it to a 'first' flag, we need to first parse the datetime, determine the transition points that year, and then which side of the nearest transition point it lies on. Note that there can be more than 2 transition points in a year, and no API has been discussed for discovering them.
- I think datetime should consider 1 day == 24 hours and not have concepts like years or months, just like it does today. As others suggested, a separate module dealing with leap years and variable length days may be useful to some people, as would leapsecond support for astronomers and astrologers. But if the default implementation gives different results to all the other tools on your system, people will think the default is wrong. - Offsets should ideally be declared in seconds. Last I looked, the current Python implementation rounds them to the nearest minute and it would be nice to fix that. These are almost always historical, dating from when noon was when the sun was at its highest point above the capital (eg. Europe/Amsterdam before 1938) - There are cases where there are gaps at the end of DST, and folds at the beginning of DST, when the timezone offsets were changed simultaneously with the dst flag. - Microsoft's timezone database does not contain historical information, which is why databases that need support under Windows like PostgreSQL include the IANA/Olson database. - Thank you to everyone who has been working on this. I've wanted it for a long, long time but never got around to remembering how to write C. -- Stuart Bishop http://www.stuartbishop.net/ From alexander.belopolsky at gmail.com Fri Aug 21 18:59:08 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 21 Aug 2015 12:59:08 -0400 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: On Fri, Aug 21, 2015 at 8:07 AM, Stuart Bishop wrote: > - The rules in PEP-495 for utcoffset() and dst() to deal with > ambiguous times only work in simple cases, as there dst offsets both > more and less than 1 hour, and there is no stdoffset since the offset > can change at the same time (eg. Europe/Vilnius 1941, where the clocks > ended up going backwards for summer time instead of forwards). 
> Instead of engaging in a theoretical discussion, I went ahead and added this transition as a test case to my reference implementation. Please review [1] and let me know if you see any issues. [1]: https://github.com/abalkin/cpython/commit/9f683c8d0f6f2b48aad81ae4e5e8a118a542d2d4 -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Fri Aug 21 22:49:13 2015 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 21 Aug 2015 15:49:13 -0500 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: [Stuart Bishop ] > Sorry I'm late. pytz author here. Hi, Stuart! Nice to see you. Stay a while :-) > Gosh you guys write a lot. I've tried to skim things, and will default > to agreeing with Tim since it is usually the smart thing to do. Excellent judgment. Although agreeing with Guido is mandatory, and he's wrong about some things here ;-) > A few notes from my skimming: > > - I want a boolean added to datetime instances, even if I don't like > the name, because I can then deprecate pytz and its confusing API and > implementation. I'm happy to work on Python implementation and > documentation. It will save me time and effort in the long run. Later you seem to say you'd prefer a 3-state flag instead, so not sure you really mean "boolean" here. > - Most of my thoughts got encoded in PEP-431. This would give us a > datetime module that operates exactly the way it does today, No. While 431 was highly obscure on this point, it turned out that Lennart was determined to change arithmetic behavior. That can't fly, for backward compatibility, and because even "aware" datetimes were intended to use a "naive time" model internally. Specifically, if you add timedelta(days=1) to a datetime today, you get "same time tomorrow" (day goes up by 1, but hour, minute, second and microsecond remain the same) in all cases. Even if a DST transition (or base-offset change, or leap-second change) occurred. 
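To make the "same time tomorrow" rule concrete, here is a minimal sketch. ToyEastern is a hypothetical tzinfo invented for this illustration; it models only the US Eastern fall-back of 2 Nov 2014 and nothing else. Adding timedelta(days=1) leaves the wall-clock fields alone, even though 25 hours elapse when measured in UTC:

```python
from datetime import datetime, timedelta, timezone, tzinfo


class ToyEastern(tzinfo):
    """Hypothetical zone: models only the 2 Nov 2014 US Eastern fall-back."""

    # Wall-clock time from which the zone reports EST instead of EDT.
    FALL_BACK = datetime(2014, 11, 2, 2)

    def utcoffset(self, dt):
        hours = -4 if dt.replace(tzinfo=None) < self.FALL_BACK else -5
        return timedelta(hours=hours)

    def dst(self, dt):
        # One hour while the offset is -4, zero while it is -5.
        return self.utcoffset(dt) + timedelta(hours=5)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


tz = ToyEastern()
start = datetime(2014, 11, 1, 12, 0, tzinfo=tz)

# Classic arithmetic: the day goes up by one, hour/minute/second are untouched.
same_time_tomorrow = start + timedelta(days=1)

# Both operands share one tzinfo, so this is a naive wall-clock difference.
wall_clock_delta = same_time_tomorrow - start

# Converting both ends to UTC first measures the time that actually elapsed.
elapsed = (same_time_tomorrow.astimezone(timezone.utc)
           - start.astimezone(timezone.utc))

print(same_time_tomorrow, wall_clock_delta, elapsed)
```

The last difference is what timeline arithmetic would care about; the first two are what datetime does by default.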
That's now called "classic" arithmetic. The default behavior can't be changed. What you seem to have in mind (accounting for two of the three known reasons for why a local clock may jump: DST and base-offset changes, but not leap second changes) is now called "timeline" (sometimes "strict") arithmetic. According to Lennart, under PEP 431 timeline arithmetic would always be used. Under PEP 495, nothing about arithmetic changes. 495 is less ambitious, only intending to supply the bit(s) needed to _allow_ timeline arithmetic to be implemented as an option later. PEP 500 is about supplying different arithmetics, but Guido hates PEP 500. In the end, I expect timezone wrappers will supply factory functions, either separate functions for "give me such-and-such a timezone using classic arithmetic" and "give me such-and-such a timezone using timeline arithmetic", or a single function specifying the desired timezone and an optional flag to specify the arithmetic desired. > but with the option of performing pytz style unambiguous datetime > arithmetic There was nothing optional about it in 431, 495 doesn't address arithmetic, except to make it _possible_ to implement timeline arithmetic. > without pytz and its confusing API. > If the developer explicity set the is_dst flag, then exceptions would > be raised when trying to instantiate an ambiguous or invalid timestamp. > For code that does not specify the new, optional flag things work as > they do today and a best guess made when the localized datetime is > constructed. It's possible that 495 should do more in this direction. For now, it specifies enough that someone who cares can easily write a function to distinguish among "ambiguous time (in a fold)", "invalid time" (in a gap), and "happy time" ;-) , and do whatever _they_ want (ignore some subset, raise an exception, print a warning, supply a default, prompt the user for more info, ...). > - PEP-495 seems similar to PEP-431, See above. 
431 was about arithmetic, although it didn't say so clearly. 495 is _only_ about adding a flag. > except that it attempts to allow things continue in the face of > an ambiguous or invalid localized datetime. > > The boolean flag is not tristate, so there is no way to have > strict checking of input. It doesn't matter if the developer said > 'whatever' and left the flag on the default, or cared enough to > explicitly override it. As above, it's possible 495 should do more. But it's hard to know when to stop. For example, there are many ways of specifying a datetime, including. e.g., using .combine() to paste a date and time together. It's generally impossible to make a fold/gap determination on a time alone - that's only possible in combination with a date. So does .combine() also need to whine? It's simpler overall to leave it to those users who care to check when they do care. > - The rules in PEP-495 for utcoffset() and dst() to deal with > ambiguous times only work in simple cases, as there dst offsets both > more and less than 1 hour, and there is no stdoffset since the offset > can change at the same time (eg. Europe/Vilnius 1941, where the clocks > ended up going backwards for summer time instead of forwards). 495 couldn't care less what causes folds and gaps - it's equally applicable to all causes, and whether in isolation or combination. What it _does_ assume is that a single bit suffices to resolve ambiguities: that there is no case in which more than two UTC times have the same spelling on a local clock. The goal of the PEP is to supply that bit. The burden is on the tzinfo supplier to set and use it correctly. The burden is also on the tzinfo supplier to supply a .utcoffset() "that works" to convert a local time to UTC, to supply a .dst() that returns whatever the tzinfo supplier thinks it should return, and to supply a .fromutc() that sets the bit correctly. 
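A sketch of the kind of checking function mentioned above, assuming the flag in the form it eventually shipped in Python 3.6, the `fold` attribute (this thread still calls it first/fold/later). ToyEastern2014 and classify are hypothetical names made up for the illustration, and the toy zone hard-codes the US Eastern rules for 2014 only. The check relies on the PEP 495 convention that, for a time in a gap, fold=0 yields the pre-transition offset and fold=1 the post-transition one:

```python
from datetime import datetime, timedelta, tzinfo

HOUR = timedelta(hours=1)


class ToyEastern2014(tzinfo):
    """Hypothetical zone: US Eastern rules hard-coded for 2014 only."""

    STD = timedelta(hours=-5)
    GAP_START = datetime(2014, 3, 9, 2)    # 02:00-02:59 never shows on the clock
    FOLD_START = datetime(2014, 11, 2, 1)  # 01:00-01:59 shows on the clock twice

    def dst(self, dt):
        wall = dt.replace(tzinfo=None)
        if self.GAP_START <= wall < self.GAP_START + HOUR:
            in_dst = bool(dt.fold)   # gap: fold=0 keeps the pre-transition rule
        elif self.FOLD_START <= wall < self.FOLD_START + HOUR:
            in_dst = not dt.fold     # fold: fold=0 is the first (EDT) reading
        else:
            in_dst = self.GAP_START <= wall < self.FOLD_START
        return HOUR if in_dst else timedelta(0)

    def utcoffset(self, dt):
        return self.STD + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


def classify(wall, tz):
    """Return 'fold', 'gap', or 'ok' for a naive wall-clock time in tz."""
    first = tz.utcoffset(wall.replace(fold=0))
    second = tz.utcoffset(wall.replace(fold=1))
    if first > second:
        return "fold"   # the same spelling occurs twice
    if first < second:
        return "gap"    # the spelling is skipped entirely
    return "ok"


tz = ToyEastern2014()
print(classify(datetime(2014, 11, 2, 1, 30), tz))
print(classify(datetime(2014, 3, 9, 2, 30), tz))
print(classify(datetime(2014, 7, 4, 12, 0), tz))
```

What to do with the answer (raise, warn, pick a default, ask the user) is left to the caller, which is the point being made in the reply.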
The default .fromutc() is indeed too weak to handle anything except zones subject to nothing fancier than DST transitions alternating between "zero" and "non-zero", and that's not changing either. Neither will the default .fromutc() be changed to set first/fold/later/is_dst - only a tzinfo implementer has enough info about how the timezone works to set the bit correctly and semi-efficiently in all cases (the default .fromutc() can only ask what the total UTC, and DST, offsets are at specific microseconds in local time - it has no knowledge deeper than that, because those are the only questions the tzinfo interface _can_ be asked). As to "more and less than 1 hour", yes, the PEP hasn't been updated to clarify that "hour" _means_ "some number of microseconds" ;-) > - Other APIs I know of, including Python's time module, uses is_dst or > isdst as the required boolean flag. As do the timezone databases > containing the data we need. I think the argument against the is_dst > flag name in PEP-495 is flaccid. is_dst makes no sense for base-offset or leap-second transitions either; "first"/"fold"/"later" make equally clear sense for all causes of folds. But Guido hates leap seconds, seemingly intending to make it impossible for anyone to support them directly (via overloading datetime arithmetic operators), and so the case against "is_dst" is weaker now. > - If there is an argument in favour of 'first' over 'is_dst', it is > because occasionally there are timezone changes without a dst > transition. If we call it is_dst, we agree that in a few rare > historical cases we are going to have to lie. There are only two tzinfo authors in the world ;-) (you and Gustavo), and by all evidence you're both way more than bright enough to adapt to any spelling ;-) > - My argument in favour of 'is_dst' over 'first' is that this is what > we have in the data we are trying to load. You commonly have > a timestamp with a timezone abbreviation and/or offset.
This can > easily be converted to an is_dst flag. You mean by using platform C library functions (albeit perhaps wrapped by Python)? > To convert it to a 'first' flag, we need to first parse the datetime, I'm unclear on this. To get a datetime _at all_ the timestamp has to be converted to calendar notation (year, month, ...). Which is what I'm guessing "parse" means here. That much has to be done in any case. > determine the transition points that year, and then which side of > the nearest transition point it lies. Note that there can be more > than 2 transition points in a year, and no api has been discussed for > discovering them. Python doesn't need such an API. It needs the tzinfo author to implement .utcoffset(), .dst(), and .fromutc() according to whatever rules a timezone requires. Python code would convert the timestamp to UTC calendar notation first, then use .astimezone() to convert to whatever "timezone abbreviation and/or offset" was specified. astimezone() in turn gets everything it needs from the tzinfo's .fromutc(). I'm unclear anyway on why you'd trust an external is_dst flag to be reliable in the funky cases where, e.g., base-offset and DST transitions coincide. You either think it's important to handle such cases or you don't. If you do, what do _you_ think tm_isdst means in such cases? If you're relying on external code to compute is_dst for you, then it doesn't matter what anyone in the Python world thinks it should mean. It only matters what the universe of C library authors thought it should mean, assuming they were even aware of such cases. The relevant standards are no help at all in such edge cases. The web is filled with complaints about puzzling tm_isdst behavior in edge cases, and no two implementations seem to agree on what -1 "really means" even in seemingly straightforward cases. 
I'd rather that Python tzinfo authors implement exactly what _they_ think a timezone's rules really are - which indeed requires analyzing a time using all the timezone's internal rules. > - I think datetime should consider 1 day == 24 hours and not have > concepts like years or months, just like it does today. As others > suggested, a separate module dealing with leap years and variable > length days may be useful to some people, as would leapsecond support > for astronomers and astrologers. But if the default implementation > gives different results to all the other tools on your system, people > will think the default is wrong. Not sure what you mean here without specific examples of what you have in mind. But, as above, classic arithmetic will remain the default regardless - it's a dozen years too late to change that, even if everyone wanted to (and - surprise - everyone doesn't ;-) ). > - Offsets should ideally be declared in seconds. Last I looked, the > current Python implementation rounds them to the nearest minute and it > would be nice to fix that. These are almost always historical, dating > from when noon was when the sun was at its highest point above the > capital (eg. Europe/Amsterdam before 1938) Offsets are currently required to be a multiple of a minute (no rounding is done - an exception is raised if an offset is not a multiple of a minute, with magnitude less than 24*60 (the number of minutes in a day)). That should change, and Alexander has already done most of the work for it, but it's not in the scope of this PEP. "The flag" can be added with or without that change. > - There are cases where there are gaps at the end of DST, and folds at > the beginning of DST, when the timezone offsets were changed > simultaneously with the dst flag. That's fine, provided again that a single bit suffices to resolve ambiguous times on the local clock. A fold is a fold and a gap is a gap, regardless of cause. 
It's only if we, e.g., _name_ the flag "is_dst" that someone is likely to erroneously assume that the flag always _means_ "and so there's a fold when it changes from True to False, and a gap when it changes from False to True". > - Microsoft's timezone database does not contain historical > information, which is why databases that need support under Windows > like PostgreSQL include the IANA/Olson database. > > - Thank you to everyone who has been working on this. I've wanted it > for a long, long time but never got around to remembering how to write > C. Au contraire - thank _you_ for pytz! That was such an heroic effort to overcome the lack of a bit that it's legendary :-) We'll get this all to work cleanly in the end. From alexander.belopolsky at gmail.com Fri Aug 21 23:51:46 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 21 Aug 2015 17:51:46 -0400 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: On Fri, Aug 21, 2015 at 4:49 PM, Tim Peters wrote: > As to "more and less than 1 hour", yes, the PEP hasn't been updated to > clarify that "hour" _means_ "some number of microseconds" ;-) > Thanks for the reminder! I updated [1] the draft on Github. I will try to get a new version published at the peps site this weekend. [1]: https://github.com/abalkin/ltdf/commit/33184bbcfca84c94081d4a291bbcd3440ed25b6c -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sat Aug 22 01:33:46 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 21 Aug 2015 19:33:46 -0400 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: On Fri, Aug 21, 2015 at 4:49 PM, Tim Peters wrote: > > - Most of my thoughts got encoded in PEP-431. This would give us a > > datetime module that operates exactly the way it does today, > > No. 
While 431 was highly obscure on this point, it turned out that > Lennart was determined to change arithmetic behavior. More importantly, PEP 431 did not propose adding any additional state to datetime instances. The proposal [1] was to change signatures of the three tzinfo methods and that of datetime.astimezone. I don't think with PEP 431 alone, the following behavior specified [2] in PEP 495 is possible:
>>> dt1.strftime('%D %T %Z%z')
'11/02/14 01:30:00 EDT-0400'
>>> dt2.strftime('%D %T %Z%z')
'11/02/14 01:30:00 EST-0500'
[1]: https://www.python.org/dev/peps/pep-0431/#new-parameter-is-dst
[2]: https://www.python.org/dev/peps/pep-0495/#conversion-from-naive-to-aware
From guido at python.org Sat Aug 22 02:48:04 2015 From: guido at python.org (Guido van Rossum) Date: Fri, 21 Aug 2015 17:48:04 -0700 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net> <55D4E74F.7050501@oddbird.net> Message-ID: I lost interest in this thread, but PEP 500 is still rejected. :-) -- --Guido van Rossum (python.org/~guido)
From tim.peters at gmail.com Sat Aug 22 20:55:50 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 22 Aug 2015 13:55:50 -0500 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: [Stuart Bishop] >> ... >> To convert it to a 'first' flag, we need to first parse the datetime, >> determine the transition points that year, and then which side of >> the nearest transition point it lies. Note that there can be more >> than 2 transition points in a year, and no api has been discussed for >> discovering them. [Tim] > Python doesn't need such an API. Let me be clearer about this.
I appreciate that Olson-general timezones are a PITA to implement both compactly and efficiently. tzinfo internals may want all sorts of internal "helper APIs" to ease that burden. The alternative is asking Alexander "and what about this year?" for each annoying case, and waiting for him to write a class implementing the rules for just the specific year in question ;-) Hammering that stuff out is certainly appropriate on this list - it's just not in the scope of PEP 495. That's about specifying visible results (requirements on tzinfo implementations), and adding a user-visible flag requires a PEP. tzinfo internals don't require approval from anyone. In that respect, PEP 500 would have allowed each tzinfo class to implement arithmetic in the most efficient way sufficient for the timezone it represents. But that's been killed. What we're left with appears to be just one other bit, spelled via the presence or absence of a new magic attribute on a tzinfo instance, or via inheritance or non-inheritance from a new marker class. That new bit is used to spell "classic or timeline arithmetic?" (which datetime internals - not tzinfos - will be required to implement). That does allow for one major zone-dependent optimization: if a timezone has a single UTC offset and a single name for all eternity (timezone.utc being the most important example), then .utcoffset(), .dst(), .tzname() and .fromutc() are one-liners, and classic and timeline arithmetic are the same thing. So such zones can (& "should") use a single dirt-simple "I want classic arithmetic" class regardless of the arithmetic bit (for simplicity and efficiency: gaps & folds don't exist in such zones, and classic arithmetic goes much faster than timeline arithmetic). For internal purposes, it _may_ be that a tzinfo wrapping would like to make similar distinctions among other kinds of timezones. 
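As a sketch of the "one-liners" case described above: a zone with one offset and one name forever. FixedZone is a hypothetical class written for this illustration (the standard library's datetime.timezone already covers the same ground). Because the offset never changes, classic arithmetic and the convert-to-UTC-and-back route land on the same wall-clock time:

```python
from datetime import datetime, timedelta, timezone, tzinfo


class FixedZone(tzinfo):
    """Hypothetical single-offset zone: every method is a one-liner."""

    def __init__(self, offset, name):
        self._offset = offset
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return self._name

    def fromutc(self, dt):
        # dt carries UTC fields with tzinfo=self; shift them to local time.
        return dt + self._offset


zone = FixedZone(timedelta(hours=-5), "FIX-5")
start = datetime(2015, 8, 22, 23, 0, tzinfo=zone)
step = timedelta(hours=3)

# Classic arithmetic: add directly to the local wall-clock fields.
classic = start + step

# Timeline arithmetic spelled out: go to UTC, do the math, come back.
timeline = (start.astimezone(timezone.utc) + step).astimezone(zone)

print(classic, timeline)
```

With no gaps or folds there is nothing for the round trip to correct, so the cheaper direct addition is sufficient for such zones.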
For example, if it sees that it's wrapping a zone that's only ever had one base UTC offset, and a single DST rule, it could be modeled by an instance of a tzinfo subclass with code customized to take advantage of those regularities. But all of that is out of scope for 495 too.
From alexander.belopolsky at gmail.com Sat Aug 22 21:15:20 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 22 Aug 2015 15:15:20 -0400 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: On Sat, Aug 22, 2015 at 2:55 PM, Tim Peters wrote: > [Tim] > > Python doesn't need such an [time zone transition point discovery] API. > > Let me be clearer about this. I appreciate that Olson-general > timezones are a PITA to implement both compactly and efficiently. > tzinfo internals may want all sorts of internal "helper APIs" to ease > that burden. The alternative is asking Alexander "and what about this > year?" for each annoying case, and waiting for him to write a class > implementing the rules for just the specific year in question ;-) I don't really mind. I enjoy learning about exotic places and the timekeeping practices there. (The secret mission behind the killed PEP 500 was to implement the Martian Time. :-) > Hammering that stuff out is certainly appropriate on this list - it's > just not in the scope of PEP 495. That's about specifying visible > results (requirements on tzinfo implementations), and adding a > user-visible flag requires a PEP. tzinfo internals don't require > approval from anyone. Please keep hammering. If we add an additional member to the datetime objects, I want the new API to be good for the next 20 years and not just whet the appetite for adding more and more with every Python release. For example, if anyone can find a place on Earth where 01:30 AM was repeated 3 times in one day - I will be all for changing ltdf type from boolean to integer.
Not because I think it is important to support that one exotic event, but because if that was done once somewhere it will certainly be done again somewhere else.
From chris.barker at noaa.gov Sat Aug 22 23:35:39 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Sat, 22 Aug 2015 14:35:39 -0700 Subject: [Datetime-SIG] PEP-0500 (Alternative datetime arithmetic) Was: PEP 495 ... is ready ... In-Reply-To: References: <55D34397.9030502@stoneleaf.us> <55D4A30D.30909@oddbird.net> <55D4E74F.7050501@oddbird.net> Message-ID: On Thu, Aug 20, 2015 at 1:38 PM, ISAAC J SCHWABACHER wrote: > The problem with abbreviations like "CET" isn't that they don't represent > time zones; the problem is that they're not unique. Exactly -- and I don't know what vocabulary we should use, but maybe a nice recursive definition for this discussion: "A timezone is the thing that a tzinfo object represents" would do it. But then I don't know what words to use to describe the other "things" we need to talk about... > CST is probably Central Standard Time... unless it's China Standard Time > or Cuba Standard Time. There are plenty of other examples. We can't stop > people from using these, and we shouldn't refuse to support them out of > spite, but if we present to users one obvious way to do it that forces us > to guess in the face of this ambiguity, I'd argue that that's bad design. Is anyone proposing that? I know I find notations like "EST" helpful in daily life: This memo was written at "5:27pm EDT". It's clear to an American, anyway, what that means (and I'm from Seattle, but happen to be in Boston now, so it's actually kind of helpful :-) ) But those notations are strictly one-way streets -- not very useful if you want to construct an aware datetime instance from them -- at least without more info. Do pytz or dateutil let you construct a tzinfo instance from "EST" or the like? (if the answer is no -- no need to drag the thread out...) -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov
From chris.barker at noaa.gov Sat Aug 22 23:46:43 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Sat, 22 Aug 2015 14:46:43 -0700 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: On Sat, Aug 22, 2015 at 11:55 AM, Tim Peters wrote: > What we're left with > appears to be just one other bit, spelled via the presence or absence > of a new magic attribute on a tzinfo instance, or via inheritance or > non-inheritance from a new marker class. That new bit is used to > spell "classic or timeline arithmetic?" (which datetime internals - > not tzinfos - will be required to implement). If it would be implemented by datetime, or a datetime subclass, wouldn't it make sense for that attribute to be on the datetime instance, rather than a tzinfo instance?
Anyway, I have an interest in seeing that done -- and Alexander is right, it would be better to make whatever changes we need to datetime now in a way that will last. so is it time to add an attribute to specify "timeline" arithmetic somewhere now? BTW, as I read it, PEP 431's biggest contribution was to bring access to a timezone database into the stdlib -- is that idea dead? -Chris > gaps > & folds don't exist in such zones, and classic arithmetic goes much > faster than timeline arithmetic). > much faster? isn't it "just a "convert to UTC, do the math, convert back to the TZ?" and for the "simple" TZs, the convert is an addition or subtraction. Is it worth optimizing that out? Though I suppose that if the only thing you need to do to optimize it is for the tzinfo class to be responsible for that decision, then maybe that is a fine argument for putting that decision there. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sat Aug 22 23:53:38 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 22 Aug 2015 17:53:38 -0400 Subject: [Datetime-SIG] Is EDT a timezone? Was: PEP-0500 Message-ID: On Sat, Aug 22, 2015 at 5:35 PM, Chris Barker wrote: > Do pytz or datetutils let you construct a tzinfo instance from "EST" or > the like? > > (if the answer is no -- no need to drag the thread out...) You can discuss this on pytz or dateutils mailing lists. The issue that I find relevant for this group is the question I now replaced the subject with: Is EDT a timezone? 
The answer provided by python 3.3 and later is unequivocal "yes":

>>> from datetime import *
>>> u = datetime.now(timezone.utc)
>>> t = u.astimezone()
>>> t.tzname()
'EDT'
>>> isinstance(t.tzinfo, timezone)
True
>>> print(t.tzinfo)
EDT

Some people on this list claimed that the following behavior is a bug:

>>> (t + timedelta(100)).strftime('%F %T %Z%z')
'2015-11-30 17:45:51 EDT-0400'

because the correct result should be '2015-11-30 16:45:51 EST-0500'.

My answer to that is that if you need that result, you can get it, but you have to ask for it explicitly:

>>> (t + timedelta(100)).astimezone().strftime('%F %T %Z%z')
'2015-11-30 16:45:51 EST-0500'

I don't think we can do much here other than to educate Python users.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From alexander.belopolsky at gmail.com Sun Aug 23 00:08:32 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sat, 22 Aug 2015 18:08:32 -0400
Subject: [Datetime-SIG] Timezone database Was: PEP-431/..
Message-ID:

I changed the subject and removed PEP-495 from it because it is beyond the scope of PEP-495.

On Sat, Aug 22, 2015 at 5:46 PM, Chris Barker wrote:

> BTW, as I read it, PEP 431's biggest contribution was to bring access to a
> timezone database into the stdlib -- is that idea dead?

I don't think it is dead, but I think it is premature. There are plenty of improvements that we can bring to the datetime module while staying within the confines of POSIX interface to the system time zones. Along the way, we will certainly make the lives of tzinfo providers easier and hopefully one of the packages such as pytz will get to the point where PEP-431-bis will consist of one line: "Let's accept pytz into Python standard library."

-------------- next part --------------
An HTML attachment was scrubbed...
URL: From guido at python.org Sun Aug 23 00:11:16 2015 From: guido at python.org (Guido van Rossum) Date: Sat, 22 Aug 2015 15:11:16 -0700 Subject: [Datetime-SIG] Is EDT a timezone? Was: PEP-0500 In-Reply-To: References: Message-ID: On Sat, Aug 22, 2015 at 2:53 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > Some people on this list claimed that the following behavior is a bug: > > >>> (t + timedelta(100)).strftime('%F %T %Z%z') > '2015-11-30 17:45:51 EDT-0400' > > because the correct result should be '2015-11-30 16:45:51 EST-0500'. > > My answer to that is that if you need that result, you can get it, but you > have to ask for it explicitly: > > >>> (t + timedelta(100)).astimezone().strftime('%F %T %Z%z') > '2015-11-30 16:45:51 EST-0500' > > I don't think we can do much here other than to educate Python users. > It is disappointing that you still believe this, because the intention of introducing DST-aware tzinfo objects was to be able to get the latter answer. The trick that pytz uses to obtain timeline arithmetic causes it to use only fixed-offset tzinfo objects. It's true that some platforms may not give enough information about the local timezone to do better -- but others do. There is no requirement that tzname is unique to determine all the properties of a timezone. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sun Aug 23 00:17:39 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 22 Aug 2015 18:17:39 -0400 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: On Sat, Aug 22, 2015 at 5:46 PM, Chris Barker wrote: > > gaps > >> & folds don't exist in such zones, and classic arithmetic goes much >> faster than timeline arithmetic). >> > > much faster? isn't it "just a "convert to UTC, do the math, convert back > to the TZ?" 
and for the "simple" TZs, the convert is an addition or > subtraction. Is it worth optimizing that out? > You under-appreciate how well-optimized datetime arithmetic is in CPython. Adding/subtracting datetime objects is just 1.5-2x slower than adding/subtracting integers. Compared to that, a single call to .utcoffset() ("even if the function itself is implemented in C") is often unacceptable overhead. Note that there are many programs written already that use datetime arithmetic extensively. We cannot possibly make them run 10x slower in the next Python release. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Sun Aug 23 00:28:32 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 22 Aug 2015 17:28:32 -0500 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: [Tim] >> What we're left with >> appears to be just one other bit, spelled via the presence or absence >> of a new magic attribute on a tzinfo instance, or via inheritance or >> non-inheritance from a new marker class. That new bit is used to >> spell "classic or timeline arithmetic?" (which datetime internals - >> not tzinfos - will be required to implement). [Chris Barker ] > if it would be implemented by datetime, or a datetime subclass, wouldn't it > make sense for that attribute to be on the datetime instance, rather than a > tzinfo instance? Which is why arithmetic "belongs in" tzinfo too - how the local clock acts is entirely about the timezone. But, given that we're not going to do it that way, Guido already answered (to you!) why it's better to put the arithmetic flag in a tzinfo.: datetimes are created all over the place and by all kinds of operations, but a program typically only has a few places where it invokes a factory function to get a tzinfo instance. 
It's not credible that people will want to switch arithmetic on a datetime-by-datetime basis; it's entirely credible that they want all datetimes with a given tzinfo to use the same kind of arithmetic. So putting the flag on the tzinfo makes what everyone wants dead easy to do, at the cost of making things nobody wants to do harder ;-) > ... > so is it time to add an attribute to specify "timeline" arithmetic somewhere > now? No. And I won't discuss it more. If you want to push it, get Guido to insist on it ;-) > BTW, as I read it, PEP 431's biggest contribution was to bring access to a > timezone database into the stdlib -- is that idea dead? No, but it's not in PEP 495. It's not in any active PEP at the moment. But since Stuart Bishop is here now, and he wrote pytz, I don't expect that to last ;-) >> & folds don't exist in such zones, and classic arithmetic goes much >> faster than timeline arithmetic). > much faster? Yes. At least factor-of-10 faster. Alexander recently posted code showing that, and it's not at all surprising. > isn't it "just a "convert to UTC, do the math, convert back to the TZ?" Which are far more expensive than you might guess. Spend some time staring at "the rules" in the Olson database. They're a god-awful irregular mess. Each conversion has to take into account an arbitrarily long list of historical base-offset changes, then from that find the current DST rules in effect, and from that do irregular calendar calculations like "second Sunday in March" to figure out when DST begins and ends in the current year (and even those may occur more often than _just_ once per year). Alexander got his results from "timeline arithmetic" that was utterly trivial compared to all that, just the extra expense of some doing some semantically useless +/- timedelta(0) operations. I wouldn't be surprised to see timeline arithmetic approaching being 100 times slower for a timezone that's full of historical baggage. 
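To make the two kinds of arithmetic concrete, here is a minimal sketch. It is an illustration under stated assumptions, not anything proposed in this thread: `Eastern2015` is a hypothetical toy tzinfo that hard-codes the 2015 US Eastern transitions (a real tzinfo would have to walk the Olson rules described above), and `timeline_add` is an assumed helper name, not an existing datetime API.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class Eastern2015(tzinfo):
    """Hypothetical toy tzinfo: US Eastern rules for 2015 only."""

    DST_START = datetime(2015, 3, 8, 2)   # local wall clock
    DST_END = datetime(2015, 11, 1, 2)    # local wall clock

    def dst(self, dt):
        wall = dt.replace(tzinfo=None)
        if self.DST_START <= wall < self.DST_END:
            return timedelta(hours=1)
        return timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


def timeline_add(dt, delta):
    """Assumed helper: convert to UTC, do the math, convert back."""
    return (dt.astimezone(timezone.utc) + delta).astimezone(dt.tzinfo)


FMT = "%Y-%m-%d %H:%M:%S %Z%z"
t = datetime(2015, 8, 22, 17, 45, 51, tzinfo=Eastern2015())

classic = t + timedelta(100)                 # wall clock advances exactly 100 days
timeline = timeline_add(t, timedelta(100))   # elapsed time is exactly 100 days

print(classic.strftime(FMT))    # 2015-11-30 17:45:51 EST-0500
print(timeline.strftime(FMT))   # 2015-11-30 16:45:51 EST-0500
```

Classic addition never calls into the tzinfo at all, while the timeline version calls utcoffset() on the way to UTC and fromutc() (which in turn calls utcoffset() and dst()) on the way back. Even with a tzinfo this trivial, those calls are the extra work being discussed.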
> and for the "simple" TZs, the convert is an addition or subtraction. Only if they're implemented to take advantage of the simplifications that are possible. Which only a tzinfo implementer can know. > Is it worth optimizing that out? Not to me, because I have no use for builtin timeline arithmetic at all, But I bet _you'd_ complain about it ;-) > Though I suppose that if the only thing you need to do to optimize it is > for the tzinfo class to be responsible for that decision, then maybe that > is a fine argument for putting that decision there. Only a tzinfo author has any idea how a timezone works at a high level. All the _user_ of a tzinfo can know is the UTC offset at some particular microsecond in local time (& similarly microscopic things of no use for optimization). From alexander.belopolsky at gmail.com Sun Aug 23 00:35:10 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 22 Aug 2015 18:35:10 -0400 Subject: [Datetime-SIG] Is EDT a timezone? Was: PEP-0500 In-Reply-To: References: Message-ID: On Sat, Aug 22, 2015 at 6:11 PM, Guido van Rossum wrote: > On Sat, Aug 22, 2015 at 2:53 PM, Alexander Belopolsky < > alexander.belopolsky at gmail.com> wrote: > >> Some people on this list claimed that the following behavior is a bug: >> >> >>> (t + timedelta(100)).strftime('%F %T %Z%z') >> '2015-11-30 17:45:51 EDT-0400' >> >> because the correct result should be '2015-11-30 16:45:51 EST-0500'. >> >> My answer to that is that if you need that result, you can get it, but >> you have to ask for it explicitly: >> >> >>> (t + timedelta(100)).astimezone().strftime('%F %T %Z%z') >> '2015-11-30 16:45:51 EST-0500' >> >> I don't think we can do much here other than to educate Python users. >> > > It is disappointing that you still believe this, because the intention of > introducing DST-aware tzinfo objects was to be able to get the latter > answer. 
The trick that pytz uses to obtain timeline arithmetic causes it to > use only fixed-offset tzinfo objects. It's true that some platforms may not > give enough information about the local timezone to do better -- but others > do. There is no requirement that tzname is unique to determine all the > properties of a timezone. > There is nothing here to believe or not to believe. If there is a bug in the datetime module -- we should fix it. If not -- we should educate the users who think the current behavior is a bug. The feature lack of which you may find disappointing is that we don't have the means in the stdlib to get the following: >>> t = datetime.now(LocalZone) >>> t.strftime('%F %T %Z%z') '2015-08-22 17:45:51 EDT-0400' >>> (t + timedelta(100)).strftime('%F %T %Z%z') '2015-11-30 16:45:51 EST-0500' I agree. It would be nice to have something like this (and agree whether it should return 17:45 or 16:45). However, this has nothing to do with having or not having the bug in the library that we ship today. At some point we draw the line saying that the standard datetime module will only implement fixed offset timezones. These timezones were implemented and for the most part correctly. At least in the examples that I provided the code works the way I expect it to work. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sun Aug 23 00:49:30 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 22 Aug 2015 18:49:30 -0400 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: On Sat, Aug 22, 2015 at 6:28 PM, Tim Peters wrote: > [Chris Barker ] > > if it would be implemented by datetime, or a datetime subclass, wouldn't > it > > make sense for that attribute to be on the datetime instance, rather > than a > > tzinfo instance? > > Which is why arithmetic "belongs in" tzinfo too - how the local clock > acts is entirely about the timezone. 
> While this is a very logical conclusion, I find it challenging to explain why arithmetic selection flag "belongs" to tzinfo while the "local time disambiguation flag" "belongs" to the datetime instance. It feels backwards: DST is the stuff about timezones while arithmetic is the stuff about datetime. Yet, we have what we have. In an alternative universe, maybe we could have a DateTime metaclass that would produce a separate datetime class for each timezone and then tzinfo would be a class variable rather than instance member. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sun Aug 23 01:02:09 2015 From: guido at python.org (Guido van Rossum) Date: Sat, 22 Aug 2015 16:02:09 -0700 Subject: [Datetime-SIG] Is EDT a timezone? Was: PEP-0500 In-Reply-To: References: Message-ID: On Sat, Aug 22, 2015 at 3:35 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > > On Sat, Aug 22, 2015 at 6:11 PM, Guido van Rossum > wrote: > >> On Sat, Aug 22, 2015 at 2:53 PM, Alexander Belopolsky < >> alexander.belopolsky at gmail.com> wrote: >> >>> Some people on this list claimed that the following behavior is a bug: >>> >>> >>> (t + timedelta(100)).strftime('%F %T %Z%z') >>> '2015-11-30 17:45:51 EDT-0400' >>> >>> because the correct result should be '2015-11-30 16:45:51 EST-0500'. >>> >>> My answer to that is that if you need that result, you can get it, but >>> you have to ask for it explicitly: >>> >>> >>> (t + timedelta(100)).astimezone().strftime('%F %T %Z%z') >>> '2015-11-30 16:45:51 EST-0500' >>> >>> I don't think we can do much here other than to educate Python users. >>> >> >> It is disappointing that you still believe this, because the intention of >> introducing DST-aware tzinfo objects was to be able to get the latter >> answer. The trick that pytz uses to obtain timeline arithmetic causes it to >> use only fixed-offset tzinfo objects. 
It's true that some platforms may not >> give enough information about the local timezone to do better -- but others >> do. There is no requirement that tzname is unique to determine all the >> properties of a timezone. >> > > There is nothing here to believe or not to believe. If there is a bug in > the datetime module -- we should fix it. If not -- we should educate the > users who think the current behavior is a bug. > > The feature lack of which you may find disappointing is that we don't have > the means in the stdlib to get the following: > > >>> t = datetime.now(LocalZone) > >>> t.strftime('%F %T %Z%z') > '2015-08-22 17:45:51 EDT-0400' > >>> (t + timedelta(100)).strftime('%F %T %Z%z') > '2015-11-30 16:45:51 EST-0500' > > I agree. It would be nice to have something like this (and agree whether > it should return 17:45 or 16:45). However, this has nothing to do with > having or not having the bug in the library that we ship today. > > At some point we draw the line saying that the standard datetime module > will only implement fixed offset timezones. These timezones were > implemented and for the most part correctly. At least in the examples that > I provided the code works the way I expect it to work. > Ah. I think I finally understand why we're failing to communicate. Your claim is merely that the stdlib doesn't have the means to construct a tzinfo with the desirable behavior from the timezone information available in the time module. We have the tzname tuple and the offset for the current time, but nothing about the DST transition algorithm, so the best we can do is create a fixed-offset tzinfo for the current offset. I will give you that. What I am objecting to is that when you ask pytz to construct a datetime from a DST-aware timezone, it produces a datetime whose tzinfo has a fixed offset. 
Such a datetime then exhibits the same behavior as shown in your example for LocalZone, even though pytz has the DST rules for the original timezone -- and this is what I called "trading one bug for another" (since pytz does this in order to obtain timeline arithmetic). PS. I've never used pytz or tried to read its docs. I am merely going by what I think I read during this discussion. If I've got my facts wrong then I apologize. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sun Aug 23 01:43:54 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 22 Aug 2015 19:43:54 -0400 Subject: [Datetime-SIG] Is EDT a timezone? Was: PEP-0500 In-Reply-To: References: Message-ID: On Sat, Aug 22, 2015 at 7:02 PM, Guido van Rossum wrote: [Alexander Belopolsky] > The feature lack of which you may find disappointing is that we don't have >> the means in the stdlib to get the following: >> >> >>> t = datetime.now(LocalZone) >> >>> t.strftime('%F %T %Z%z') >> '2015-08-22 17:45:51 EDT-0400' >> >>> (t + timedelta(100)).strftime('%F %T %Z%z') >> '2015-11-30 16:45:51 EST-0500' >> >> I agree. It would be nice to have something like this (and agree whether >> it should return 17:45 or 16:45). However, this has nothing to do with >> having or not having the bug in the library that we ship today. >> >> At some point we draw the line saying that the standard datetime module >> will only implement fixed offset timezones. These timezones were >> implemented and for the most part correctly. At least in the examples that >> I provided the code works the way I expect it to work. >> > [Guido van Rossum] > > Ah. I think I finally understand why we're failing to communicate. Your > claim is merely that the stdlib doesn't have the means to construct a > tzinfo with the desirable behavior from the timezone information available > in the time module. 
We have the tzname tuple and the offset for the current
> time, but nothing about the DST transition algorithm, so the best we can do
> is create a fixed-offset tzinfo for the current offset. I will give you
> that.
>

It's not that we don't have the means. After all, the mktime-based LocalZone implementation has been presented in the library manual for ages. I think it was the same ambiguity about folds and gaps and the traditional vs. timeline arithmetic that prevented it from getting into the datetime module.

The fixed offset timezones solved a very practical problem: you get an email from someone in Australia with a timestamp in UTC+10:00 timezone (which includes +1000 in it) and you want to compare it to datetime.now(). This problem is solved in Python 3.3+. As a bonus, you also have means of doing the timeline arithmetic, but not as easily as one might wish.

> What I am objecting to is that when you ask pytz to construct a datetime
> from a DST-aware timezone, it produces a datetime whose tzinfo has a fixed
> offset. Such a datetime then exhibits the same behavior as shown in your
> example for LocalZone, even though pytz has the DST rules for the original
> timezone -- and this is what I called "trading one bug for another" (since
> pytz does this in order to obtain timeline arithmetic).
>

The author of pytz had no means of storing the extra bit necessary to do timeline arithmetic in local time notation in the datetime object. So for him, what was more or less an arbitrary choice for us was a necessity. As Tim characterized it, Stuart made a heroic effort to overcome that limitation. Hopefully with PEP 495 we will resolve this issue once and for all.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From guido at python.org Sun Aug 23 02:15:37 2015
From: guido at python.org (Guido van Rossum)
Date: Sat, 22 Aug 2015 17:15:37 -0700
Subject: [Datetime-SIG] Is EDT a timezone?
Was: PEP-0500 In-Reply-To: References: Message-ID: On Sat, Aug 22, 2015 at 4:43 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > On Sat, Aug 22, 2015 at 7:02 PM, Guido van Rossum > wrote: > [Alexander Belopolsky] > >> The feature lack of which you may find disappointing is that we don't >>> have the means in the stdlib to get the following: >>> >>> >>> t = datetime.now(LocalZone) >>> >>> t.strftime('%F %T %Z%z') >>> '2015-08-22 17:45:51 EDT-0400' >>> >>> (t + timedelta(100)).strftime('%F %T %Z%z') >>> '2015-11-30 16:45:51 EST-0500' >>> >>> I agree. It would be nice to have something like this (and agree >>> whether it should return 17:45 or 16:45). However, this has nothing to >>> do with having or not having the bug in the library that we ship today. >>> >>> At some point we draw the line saying that the standard datetime module >>> will only implement fixed offset timezones. These timezones were >>> implemented and for the most part correctly. At least in the examples that >>> I provided the code works the way I expect it to work. >>> >> [Guido van Rossum] > >> >> Ah. I think I finally understand why we're failing to communicate. Your >> claim is merely that the stdlib doesn't have the means to construct a >> tzinfo with the desirable behavior from the timezone information available >> in the time module. We have the tzname tuple and the offset for the current >> time, but nothing about the DST transition algorithm, so the best we can do >> is create a fixed-offset tzinfo for the current offset. I will give you >> that. >> > > It's not that we don't have the means. After all, the mktime-based LocalZone > implementation have been presented in the library manual for ages. I think > it was the same ambiguity about folds and gaps and the traditional vs. > timeline arithmetic that prevented it from getting into the datetime module. 
> > The fixed offset timezones solved a very practical problem: you get an > email for someone in Australia with a timestamp in UTC+10:00 timezone > (which includes +1000 in it) and you want to compare it to datetime.now(). > This problem is solved in Python 3.3+. As a bonus, you also have means of > doing the timeline arithmetic, but not as easily as one might wish. > I would have solved this problem (and any other problem that requires timeline arithmetic) by converting to a timestamp and comparing to time.time(). > > >> What I am objecting to is that when you ask pytz to construct a datetime >> from a DST-aware timezone, it produces a datetime whose tzinfo has a fixed >> offset. Such a datetime then exhibits the same behavior as shown in your >> example for LocalZone, even though pytz has the DST rules for the original >> timezone -- and this is what I called "trading one bug for another" (since >> pytz does this in order to obtain timeline arithmetic). >> > > The author of pytz had no means of storing the extra bit necessary to do > timeline arithmetic in local time notation in the datetime object. So for > him, what was a more or less an arbitrary choice for us was a necessity. > As Tim characterized it, Stuart made a heroic effort to overcome that > limitation. Hopefully with PEP 495 we will resolve this issue once and for > all. > But PEP 495 doesn't add timeline arithmetic (it merely makes it easier to convert between timestamps and datetimes and back, except for the rounding issue). I wonder why Stuart needed timeline arithmetic? Merely being able to access the Olson database doesn't sound enough of a reason for such heroism. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From tim.peters at gmail.com Sun Aug 23 02:37:11 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 22 Aug 2015 19:37:11 -0500
Subject: [Datetime-SIG] PEP-431/495
In-Reply-To:
References:
Message-ID:

[Alexander Belopolsky]
> Please keep hammering. If we add an additional member to the datetime
> objects, I want the new API to be good for the next 20 years and not just
> whet the appetite for adding more and more with every Python release. For
> example, if anyone can find a place on Earth where 01:30 AM was repeated 3
> times in one day - I will be all for changing ltdf type from boolean to
> integer. Not because I think it is important to support that one exotic
> event, but because if that was done once somewhere it will certainly be done
> again somewhere else.

If we want to be future-proof, I think ltdf needs to change from a flag to a 64-bit unsigned int ;-)

What _will_ happen is that computer programmers will refuse to support "unreasonable" (to them) changing realities, until the current generation of programmers dies off. After all, "POSIX time" is just a mathematical abstraction that ceased being connected directly to civil time just a few years after POSIX defined it (when the Change That Must Not Be Named was added to the definition of UTC). The way it's still defended, you'd think Guido invented it ;-)

So this is what happens: the long-overdue massive earthquake on the US East Coast finally knocks New York off the North American continent, and sends it drifting east. After coasting about 1000 miles east, it stops. Mayor De Blasio will take a brave political stand, and be re-elected on a platform that includes switching New New York to a more appropriate time zone, in the coming fall when DST ends too. But city IT Professionals will tell him they can't manage shifting two hours in one jump. So, when DST ends at 2AM EDT, the clock will jump back to 1AM EST. Then when it hits 2AM EST, it will jump back to 1AM NEWNEWYORKST.
Python's 2-state flag will be inadequate to distinguish among the 3-way ambiguities. Timezone wonks will scream. Nobody else will care. Exactly the same kind of scandal as when zoneinfo incorrectly claimed that America/Whitehorse switched from UTC-9 to UTC-8 on 1966-07-01 instead of the correct 1967-05-28. Although, to be fair, I'm pretty sure it will take more than 20 years for New New York to drift that far east, so a single bit should be "good for the next 20 years" after all ;-) From tim.peters at gmail.com Sun Aug 23 04:05:16 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 22 Aug 2015 21:05:16 -0500 Subject: [Datetime-SIG] Is EDT a timezone? Was: PEP-0500 In-Reply-To: References: Message-ID: [Guido] > But PEP 495 doesn't add timeline arithmetic (it merely makes it easier to > convert between timestamps and datetimes and back, except for the rounding > issue). That's kinda like saying Python doesn't do anything (it merely makes it easier to write programs ;-) ). PEP 495 supplies something needed to go on to implement timeline arithmetic, and to reliably _convert_ between timezones (which you may think of as converting to timestamps, doing arithmetic, then converting back - but astimezone() is "a primitive" to datetime users). > I wonder why Stuart needed timeline arithmetic? Merely being able to access > the Olson database doesn't sound enough of a reason for such heroism. dateutil also wraps the Olson database (among several other sources of timezone info), but made no effort to change Python's default haphazard treatment of folds and gaps. I'd be interested to hear Stuart's thoughts on this. Back when I was searching the web for clues, I mostly saw people (blogs, message threads) speculating about _conversion_ (not mostly arithmetic) surprises. 
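As a sketch of what disambiguating a fold looks like in practice, the example below assumes the `fold` attribute in the form it eventually took in Python 3.6 and later (an integer, 0 or 1, on datetime objects); the name was still under discussion at this point in the thread. `EasternFold2015` is a hypothetical toy tzinfo covering only the 2015 US Eastern rules, written for illustration.

```python
from datetime import datetime, timedelta, timezone, tzinfo

HOUR = timedelta(hours=1)


class EasternFold2015(tzinfo):
    """Hypothetical toy tzinfo: US Eastern rules for 2015 only, fold-aware."""

    DST_START = datetime(2015, 3, 8, 2)   # local wall clock
    DST_END = datetime(2015, 11, 1, 2)    # local wall clock, read on the DST clock

    def dst(self, dt):
        wall = dt.replace(tzinfo=None)
        if self.DST_END - HOUR <= wall < self.DST_END:
            # 01:00-01:59 on Nov 1 occurs twice; fold selects which reading.
            return timedelta(0) if dt.fold else HOUR
        if self.DST_START <= wall < self.DST_END:
            return HOUR
        return timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


tz = EasternFold2015()
first = datetime(2015, 11, 1, 1, 30, tzinfo=tz)   # fold=0: earlier reading (EDT)
second = first.replace(fold=1)                    # later reading (EST)

print(first.astimezone(timezone.utc))    # 2015-11-01 05:30:00+00:00
print(second.astimezone(timezone.utc))   # 2015-11-01 06:30:00+00:00
```

The two datetimes show the same wall-clock reading, yet they map to UTC instants one hour apart. That extra piece of information is what a conversion needs in order to round-trip through the repeated hour.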
Note that once 495-compliant tzinfos exist, zone conversions using them will work correctly in all cases by magic (no change to arithmetic is required - classic arithmetic is what's needed for those and we already have it - conversions today only trip over the current inability to disambiguate folds - as the docs say, for the default .fromutc() to mimic the local clock in all cases, .dst() must consider an ambiguous time to be "in standard time" - which is wrong "half the time").

Here's a "typical enough" bloggish compilation of possible complaints.

http://www.assert.cc/2014/05/25/which-python-time-zone-library.html

Also typical is its:

"""
It may seem unlikely that your application will ever hit the problem I've demonstrated here, but in many cases I bet you can imagine how it is possible, and that's enough for me. I prefer to err on the side of correctness.
"""

and its:

"""
I will (hopefully) never need to care about them. I avoid them by simply using UTC whenever possible.
"""

Your predilection for thinking of datetimes as timestamps is, I believe, more natural in the datetime context as a predilection for sticking to UTC (which is really just a richer - and better-behaved (due to larger range and lack of floating-point surprises) - way of spelling a POSIX timestamp extended to microsecond precision).

From alexander.belopolsky at gmail.com Sun Aug 23 04:12:13 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sat, 22 Aug 2015 22:12:13 -0400
Subject: [Datetime-SIG] PEP 495 Q & A
Message-ID:

Dear SIG,

I have expanded the Q & A section of the PEP [1] and made some other changes based on your feedback. My goal is to address in the Q & As every concern that has been raised and not resulted in a change to the main text. Experience shows that I am very bad at collecting the questions. I invent the questions that no one asked and miss those that some of you did. Please let me know if I did not address your feedback.
Note that I did not include all suggestions for the name of the flag, but I thank everyone who made their suggestions. I think we are really left with two contenders: "fold" and "later." The only additional variant I would like to consider is "fold" with the integer values of 0 and 1. I think time(1, 30, fold=1) is short and sweet and looks better than time(1, 30, later=True). Note that neither spelling is self-explanatory, particularly if you see something like if dt.replace(later=True) < dt.replace(later=False) in someone's code, but the word "fold" points you in the right direction and is more Google-friendly than "later". The reason I think fold=0 and fold=1 may work better than booleans, is that you can think of the local time line as consisting of two "folds" one - the main timeline and the other a discontinuous line covering the fall-back hours. [1]: https://www.python.org/dev/peps/pep-0495/#questions-and-answers -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Sun Aug 23 04:20:26 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 22 Aug 2015 21:20:26 -0500 Subject: [Datetime-SIG] PEP 495 Q & A In-Reply-To: References: Message-ID: [Alexander Belopolsky] > ... > Note that I did not include all suggestions for the name of the flag, but I > thank everyone who made their suggestions. I think we are really left with > two contenders: "fold" and "later." The only additional variant I would > like to consider is "fold" with the integer values of 0 and 1. I think > time(1, 30, fold=1) is short and sweet and looks better than time(1, 30, > later=True). > > Note that neither spelling is self-explanatory, particularly if you see > something like if dt.replace(later=True) < dt.replace(later=False) in > someone's code, but the word "fold" points you in the right direction and is > more Google-friendly than "later". 
> > The reason I think fold=0 and fold=1 may work better than booleans, is that > you can think of the local time line as consisting of two "folds" one - the > main timeline and the other a discontinuous line covering the fall-back > hours. I'm on board with fold=0 and fold=1. I only hated "fold" when it was False and True. Now we're indexing a theoretically unbounded sequence of folds by an ordinal, which makes perfect sense - the later the time, the larger the ordinal ;-) From alexander.belopolsky at gmail.com Sun Aug 23 04:36:16 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 22 Aug 2015 22:36:16 -0400 Subject: [Datetime-SIG] PEP 495 Q & A In-Reply-To: References: Message-ID: On Sat, Aug 22, 2015 at 10:20 PM, Tim Peters wrote: > I'm on board with fold=0 and fold=1. I only hated "fold" when it was > False and True. Now we're indexing a theoretically unbounded sequence > of folds by an ordinal, which makes perfect sense - the later the > time, the larger the ordinal ;-) > Great! I'll let it simmer for a few days and start making the change in the PEP and the code. I don't think it will be right to call it a "flag" anymore. What will be the right word: a fold index? -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sun Aug 23 04:42:15 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 22 Aug 2015 22:42:15 -0400 Subject: [Datetime-SIG] PEP 495 Q & A In-Reply-To: References: Message-ID: On Sat, Aug 22, 2015 at 10:36 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > On Sat, Aug 22, 2015 at 10:20 PM, Tim Peters wrote: > >> I'm on board with fold=0 and fold=1. I only hated "fold" when it was >> False and True. Now we're indexing a theoretically unbounded sequence >> of folds by an ordinal, which makes perfect sense - the later the >> time, the larger the ordinal ;-) >> > > Great! 
I'll let it simmer for a few days and start making the change in > the PEP and the code. > > I don't think it will be right to call it a "flag" anymore. What will be > the right word: a fold index? > As an added benefit, we can get rid of the "ambiguous time" tongue-twister and call those intervals "two-fold". -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Sun Aug 23 05:17:41 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 22 Aug 2015 22:17:41 -0500 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: [Tim] >> Which is why arithmetic "belongs in" tzinfo too - how the local clock >> acts is entirely about the timezone. [Alexander] > While this is a very logical conclusion, I find it challenging to explain > why arithmetic selection flag "belongs" to tzinfo while the "local time > disambiguation flag" "belongs" to the datetime instance. Because "the rules" governing how the local clock works are defined by the tzinfo, but what the local clock _displays_ is entirely in the datetime. And I'll now make that crystal clear ;-) I hate digital clocks. Because I grew up with analog clocks, and to this very day when I see a digital clock I have to picture where "the big hand" and "the little hand" are. Otherwise I have no _real_ idea what time it is. So, on my monitor, I went to great lengths to find a non-cool analog clock gadget. Plain circular clock face, showing only 12 hours, white background with black sans serif digits at each hour position, plain black tick marks at each minute position. a plain black "hour hand" and a longer black "minute hand", and no stinking useless "second hand" ;-) But there's _also_ an "AM" or "PM" near the 6, because while I hate 24-hour analog clocks too, the AM/PM distinction is important to me. So what's wrong with this? Not much. It's confusing when - and only when - I happen to be awake in the wee hours when daylight time ends. 
The rules for when that happens are buried in the OS (my "tzinfo" implementation), but to be _maximally_ useful to me my clock gadget (my "datetime") should _also_ display "fold=0" and "fold=1" across the hours daylight time ends. The lack of that indicator isn't a problem with my tzinfo, it's a design flaw in my datetime instance. And that's why fold belongs in the datetime. Nobody realizes this only because they've never had a clock that was designed by me ;-) > It feels backwards: DST is the stuff about timezones while arithmetic > is the stuff about datetime. Yet, we have what we have. In an > alternative universe, maybe we could have a DateTime metaclass that > would produce a separate datetime class for each timezone and then > tzinfo would be a class variable rather than instance member. General fact of life: when someone takes a simple design with a few warts that's nevertheless understandable with minor effort, and tries to improve it by piling on layers of OO concepts, the result is invariably something only the improver can make head or tail of. Other than that, great idea ;-) From tim.peters at gmail.com Sun Aug 23 05:49:05 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 22 Aug 2015 22:49:05 -0500 Subject: [Datetime-SIG] PEP 495 Q & A In-Reply-To: References: Message-ID: [Alexander] > ... > I don't think it will be right to call it a "flag" anymore. What will be > the right word: a fold index? I'd go for Fold Indicator Related to Standard Time. 
Then we can just call it "first" for short ;-)

From stuart at stuartbishop.net Sun Aug 23 07:46:37 2015
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Sun, 23 Aug 2015 12:46:37 +0700
Subject: [Datetime-SIG] PEP-431/495
In-Reply-To:
References:
Message-ID:

On 23 August 2015 at 04:46, Chris Barker wrote:
> On Sat, Aug 22, 2015 at 11:55 AM, Tim Peters wrote:
>
>>
>> What we're left with
>> appears to be just one other bit, spelled via the presence or absence
>> of a new magic attribute on a tzinfo instance, or via inheritance or
>> non-inheritance from a new marker class. That new bit is used to
>> spell "classic or timeline arithmetic?" (which datetime internals -
>> not tzinfos - will be required to implement).
>
>
> if it would be implemented by datetime, or a datetime subclass, wouldn't it
> make sense for that attribute to be on the datetime instance, rather than a
> tzinfo instance?
>
> Anyway, I have an interest in seeing that done -- and Alexander is right, it
> would be better to make whatever changes we need to datetime now in a way
> that will last.
>
> so is it time to add an attribute to specify "timeline" arithmetic somewhere
> now?
>
> BTW, as I read it, PEP 431's biggest contribution was to bring access to a
> timezone database into the stdlib -- is that idea dead?

Putting a database updated a dozen times per year, often at short notice, into stdlib is a bad idea. Putting the necessary tzinfo implementations to read the existing timezone database on your platform is a great idea, as is distributing the Olson timezone database via pip for less well endowed platforms. And a 'local' tzinfo instance is pretty much a requirement. I'll try to move as much of pytz and Lennart's tzlocal into stdlib as possible. Exactly how that happens depends on the outcome of this PEP-495.
Ideally, pytz will cease to be required at all because Python stdlib will be able to give you unambiguous, correct datetime arithmetic ('timeline') out of the box using the timezone database installed on your system. A robot can push the zoneinfo database to pypi, or I will until the robot is set up. On Python 3.whatever, the pytz library will just wrap stdlib to provide backwards compatibility.

Failing the ideal situation, pytz will remain for users needing timeline arithmetic but will still offload what it can to stdlib and no longer require use of the localize() and normalize() methods (i.e. it will work as you expect, not as it does today).

--
Stuart Bishop
http://www.stuartbishop.net/

From stuart at stuartbishop.net Sun Aug 23 08:13:31 2015
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Sun, 23 Aug 2015 13:13:31 +0700
Subject: [Datetime-SIG] PEP-431/495
In-Reply-To:
References:
Message-ID:

On 21 August 2015 at 23:59, Alexander Belopolsky wrote:
>
> On Fri, Aug 21, 2015 at 8:07 AM, Stuart Bishop
> wrote:
>>
>> - The rules in PEP-495 for utcoffset() and dst() to deal with
>> ambiguous times only work in simple cases, as there are dst offsets both
>> more and less than 1 hour, and there is no stdoffset since the offset
>> can change at the same time (eg. Europe/Vilnius 1941, where the clocks
>> ended up going backwards for summer time instead of forwards).
>
>
> Instead of engaging in a theoretical discussion, I went ahead and added this
> transition as a test case to my reference implementation. Please review [1]
> and let me know if you see any issues.
>
> [1]:
> https://github.com/abalkin/cpython/commit/9f683c8d0f6f2b48aad81ae4e5e8a118a542d2d4

Seems fine.

If you want more amusing test cases, the ones I have tripped over are in http://bazaar.launchpad.net/~stub/pytz/devel/view/head:/src/pytz/tests/test_tzinfo.py. I think the one least likely to be covered is Pacific/Apia (Samoa) in 2011, when they jumped the international dateline.
No dst transition, but the offset changed from -10 to +14 and they skipped Dec 30th entirely. -- Stuart Bishop http://www.stuartbishop.net/ From guido at python.org Mon Aug 24 02:42:59 2015 From: guido at python.org (Guido van Rossum) Date: Sun, 23 Aug 2015 17:42:59 -0700 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: On Sat, Aug 22, 2015 at 10:46 PM, Stuart Bishop wrote: > On 23 August 2015 at 04:46, Chris Barker wrote: > > On Sat, Aug 22, 2015 at 11:55 AM, Tim Peters > wrote: > > > >> > >> What we're left with > >> appears to be just one other bit, spelled via the presence or absence > >> of a new magic attribute on a tzinfo instance, or via inheritance or > >> non-inheritance from a new marker class. That new bit is used to > >> spell "classic or timeline arithmetic?" (which datetime internals - > >> not tzinfos - will be required to implement). > > > > > > if it would be implemented by datetime, or a datetime subclass, wouldn't > it > > make sense for that attribute to be on the datetime instance, rather > than a > > tzinfo instance? > > > > Anyway, I have an interest in seeing that done -- and Alexander is > right, it > > would be better to make whatever changes we need to datetime now in a way > > that will last. > > > > so is it time to add an attribute to specify "timeline" arithmetic > somewhere > > now? > > > > BTW, as I read it, PEP 431's biggest contribution was to bring access to > a > > timezone database into the stdlib -- is that idea dead? > > Putting a database updated a dozen times per year, often at short > notice, into stdlib is a bad idea. Putting the necessary tzinfo > implementations to read the existing timezone database on your > platform is a great idea, as is distributing the Olson timezone > database via pip for less well endowed platforms. And a 'local' tzinfo > instance is pretty much a requirement. I ty to move as much as pytz > and Lennart's tzlocal into stdlib as possible. 
Exactly how that > happens depends on the outcome of this PEP-495. Ideally, pytz will > cease to be required at all because Python stdlib will be able to give > you unambiguous, correct datetime arithmetic ('timeline') out of the > box using the timezone database installed on your system. I robot can > push the zoneinfo database to pypi, or I will until the robot is > setup. On Python 3.whatever, the pytz library will just wrap stdlib to > provide backwards compatibility. > > Failing the ideal situation, pytz will remain for users needing > timeline arithmetic but will still offload what it can to stdlib and > no longer require use of the localize() and normalize() methods (ie. > it will work as you expect, not as it does today). > Apart from the breathless rendition that's pretty much the hope, yes. Would going forward with PEP 495 as it currently stands (a single bit to distinguish ambiguous times) be a problem? I would really like to finish this endless (oh irony, when talking about time :-) discussion. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Mon Aug 24 04:49:14 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 23 Aug 2015 21:49:14 -0500 Subject: [Datetime-SIG] Implementing tzinfo for all valid datetimes (was Re: PEP-431/495) Message-ID: [Tim] > Let me be clearer about this. I appreciate that Olson-general > timezones are a PITA to implement both compactly and efficiently. > ... Looks like I need to elaborate on that. It could well be that I'm using pytz incorrectly, but best I can tell it only handles a relatively small range of Python datetimes. Here using: >>> pytz.__version__ '2015.4' under Python 3.4.3, on Windows 10 Pro. 
    from pytz import timezone
    from datetime import datetime

    def tostr(dt):
        return dt.strftime("%Y-%m-%d %H:%M:%S %Z%z")

    uz = timezone("utc")
    ez = timezone("US/Eastern")
    u = uz.localize(datetime(2015, 8, 23, 20))

I don't much care what time I'm starting with - it happens to be "today" as I type, but the only thing of interest is that it's firmly in US/Eastern daylight time (nowhere near any transitions). So let's check:

    for i in range(30):
        u2 = u.replace(year=u.year + i)
        e = ez.normalize(u2.astimezone(ez))
        print(tostr(u2), "is", tostr(e))

giving:

    2015-08-23 20:00:00 UTC+0000 is 2015-08-23 16:00:00 EDT-0400
    2016-08-23 20:00:00 UTC+0000 is 2016-08-23 16:00:00 EDT-0400
    2017-08-23 20:00:00 UTC+0000 is 2017-08-23 16:00:00 EDT-0400
    2018-08-23 20:00:00 UTC+0000 is 2018-08-23 16:00:00 EDT-0400
    2019-08-23 20:00:00 UTC+0000 is 2019-08-23 16:00:00 EDT-0400
    2020-08-23 20:00:00 UTC+0000 is 2020-08-23 16:00:00 EDT-0400
    2021-08-23 20:00:00 UTC+0000 is 2021-08-23 16:00:00 EDT-0400
    2022-08-23 20:00:00 UTC+0000 is 2022-08-23 16:00:00 EDT-0400
    2023-08-23 20:00:00 UTC+0000 is 2023-08-23 16:00:00 EDT-0400
    2024-08-23 20:00:00 UTC+0000 is 2024-08-23 16:00:00 EDT-0400
    2025-08-23 20:00:00 UTC+0000 is 2025-08-23 16:00:00 EDT-0400
    2026-08-23 20:00:00 UTC+0000 is 2026-08-23 16:00:00 EDT-0400
    2027-08-23 20:00:00 UTC+0000 is 2027-08-23 16:00:00 EDT-0400
    2028-08-23 20:00:00 UTC+0000 is 2028-08-23 16:00:00 EDT-0400
    2029-08-23 20:00:00 UTC+0000 is 2029-08-23 16:00:00 EDT-0400
    2030-08-23 20:00:00 UTC+0000 is 2030-08-23 16:00:00 EDT-0400
    2031-08-23 20:00:00 UTC+0000 is 2031-08-23 16:00:00 EDT-0400
    2032-08-23 20:00:00 UTC+0000 is 2032-08-23 16:00:00 EDT-0400
    2033-08-23 20:00:00 UTC+0000 is 2033-08-23 16:00:00 EDT-0400
    2034-08-23 20:00:00 UTC+0000 is 2034-08-23 16:00:00 EDT-0400
    2035-08-23 20:00:00 UTC+0000 is 2035-08-23 16:00:00 EDT-0400
    2036-08-23 20:00:00 UTC+0000 is 2036-08-23 16:00:00 EDT-0400
    2037-08-23 20:00:00 UTC+0000 is 2037-08-23 16:00:00 EDT-0400
    2038-08-23 20:00:00 UTC+0000 is 2038-08-23 15:00:00 EST-0500
    2039-08-23 20:00:00 UTC+0000 is 2039-08-23 15:00:00 EST-0500
    2040-08-23 20:00:00 UTC+0000 is 2040-08-23 15:00:00 EST-0500
    2041-08-23 20:00:00 UTC+0000 is 2041-08-23 15:00:00 EST-0500
    2042-08-23 20:00:00 UTC+0000 is 2042-08-23 15:00:00 EST-0500
    2043-08-23 20:00:00 UTC+0000 is 2043-08-23 15:00:00 EST-0500
    2044-08-23 20:00:00 UTC+0000 is 2044-08-23 15:00:00 EST-0500

Oops! Somewhere around 2037-2038 it apparently lost all knowledge of US/Eastern daylight time. I expect this is why:

    >>> ez._utc_transition_times[-1]
    datetime.datetime(2037, 11, 1, 6, 0)

That is, the last transition it knows about is the end of daylight time in 2037.

In general, as I understand it, Olson-derived tzfiles reduce most calculation to uniform binary search across a precomputed, exhaustive, sorted list of transition instants, at the expense of needing to store that exhaustive list. It buys some speed and much client-code simplicity at the cost of client-side data space.

    >>> len(ez._utc_transition_times)
    237

But at least this version of tzfile doesn't store all that many. It would require over 15,000 entries to extend this way of doing it through year 9999 (where Python's datetime ends).

Does that really matter? I don't know. None of this matters much to me ;-) But a scheme that's striving to be anally correct about precise seconds for years a century ago in places nobody ever heard of ;-) should really try to make a reasonable guess about years just a few decades from now.

Digging deeper, I don't think I can pin this on tzfile. The docs say that, if possible, a tzfile also contains a POSIX-TZ-style rule to be used for times beyond the last explicit transition instant. In the US/Eastern tzfile shipped with this version of pytz, that's:

    EST5EDT,M3.2.0,M11.1.0

So a "complete" wrapping of zoneinfo also requires implementing such rules when present. And then we're back where I started: the puzzle of how to do so both efficiently and compactly.
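A rule such as EST5EDT,M3.2.0,M11.1.0 can be evaluated for any year without a stored transition list. What follows is only a minimal sketch of that idea, not code from pytz or zic: the helper names nth_weekday and us_eastern_transitions_utc are made up for illustration, and it assumes the POSIX defaults for this rule (transitions at 02:00 local time, daylight time one hour ahead of standard time).

```python
import calendar
from datetime import datetime, timedelta


def nth_weekday(year, month, week, weekday):
    """Day of month for a POSIX 'Mm.w.d' rule.

    weekday uses the POSIX convention (0 = Sunday); week runs 1..5,
    where 5 means "the last such weekday in the month".
    """
    py_weekday = (weekday - 1) % 7              # Python: 0 = Monday
    first_wd, ndays = calendar.monthrange(year, month)
    day = 1 + (py_weekday - first_wd) % 7       # first occurrence in month
    day += 7 * (week - 1)
    while day > ndays:                          # week 5 falls back to "last"
        day -= 7
    return day


def us_eastern_transitions_utc(year):
    """(DST start, DST end) as naive UTC datetimes for EST5EDT,M3.2.0,M11.1.0."""
    start_local = datetime(year, 3, nth_weekday(year, 3, 2, 0), 2)
    end_local = datetime(year, 11, nth_weekday(year, 11, 1, 0), 2)
    # The start is read on the EST clock (UTC-5), the end on the EDT clock (UTC-4).
    return start_local + timedelta(hours=5), end_local + timedelta(hours=4)


print(us_eastern_transitions_utc(2037)[1])  # 2037-11-01 06:00:00
print(us_eastern_transitions_utc(2038)[0])  # 2038-03-14 07:00:00
```

The 2037 result agrees with the last explicit transition shown above, and the 2038 result is a transition the explicit list does not contain. A general implementation would still have to parse the rule string and handle the other POSIX date forms, which this sketch skips.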
It won't be that long before explicit transition lists ending in 2037 will be useless for almost all real-life purposes :-( There's one trick that could be used as a compromise: things like "second Sunday in March" give exactly the same result (like "the second Sunday in March is March 9") for years 400 apart (every 400-year span starting at date D in the proleptic Gregorian calendar looks exactly the same as the 400-year span starting at date D+400*i, for all integer i, and where "+" is interpreted as "add to the year component"). So an exhaustive list covering a (any) 400-year span suffices to do those kinds of calculations for all years (add or subtract multiples of 400 to/from the year until hitting a year in the canonical 400-year span). From tim.peters at gmail.com Mon Aug 24 09:05:57 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 24 Aug 2015 02:05:57 -0500 Subject: [Datetime-SIG] Implementing tzinfo for all valid datetimes (was Re: PEP-431/495) In-Reply-To: References: Message-ID: [Tim] > Oops! Somewhere around 2037-2038 it apparently lost all knowledge of > US/Eastern daylight time. I expect this is why: > > >>> ez._utc_transition_times[-1] > datetime.datetime(2037, 11, 1, 6, 0) > > That is, the last transition it knows about is the end of daylight time in 2037. ... > > Digging deeper, I don't think I can pin this on tzfile. The docs say > that, if possible, a tzfile also contains a POSIX-TZ-style rule to be > used for times beyond the last explicit transition instant. In the > US/Eastern tzfile shipped with this version of pytz, that's: > > EST5EDT,M3.2.0,M11.1.0 > > So a "complete" wrapping of zoneinfo also requires implementing such > rules when present. This appears to be the scoop, although I may be wrong about some: when tzfile was first invented, like most other stuff at the time it assumed the world would end before 2038 (the first year a signed 32-bit int is too narrow to hold a UNIX(tm) seconds-since-1970 timestamp). 
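Two calendar facts used in these messages are easy to check with the stdlib alone: where a signed 32-bit seconds-since-1970 counter runs out, and that the proleptic Gregorian calendar repeats, weekdays included, every 400 years. The snippet below is just an illustrative check; the names EPOCH, INT32_MAX and same_weekday_400_years_later are introduced here and appear nowhere in pytz or zic.

```python
from datetime import date, datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1

# Last instant a signed 32-bit seconds-since-1970 timestamp can represent.
last_32bit_instant = EPOCH + timedelta(seconds=INT32_MAX)
print(last_32bit_instant)  # 2038-01-19 03:14:07+00:00

# A 400-year Gregorian cycle is a whole number of weeks, so rules like
# "second Sunday in March" land on the same dates 400 years apart.
days_in_cycle = (date(2400, 1, 1) - date(2000, 1, 1)).days
print(days_in_cycle, days_in_cycle % 7)  # 146097 0


def same_weekday_400_years_later(d):
    """True if d and the same calendar date 400 years later share a weekday."""
    return d.weekday() == d.replace(year=d.year + 400).weekday()


print(same_weekday_400_years_later(date(2015, 3, 8)))  # True
```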
Values in a tzfile were all at most 4 bytes, zic generated all transitions explicitly through the end of 2037, and that was that. Sometime later, but before the current NEWS file goes back, version 2 of tzfile was introduced. This added a new section allowing for 8-byte data, and with that came the realization that generating all transitions explicitly was a doomed approach. So version 2 also added the POSIX-TZ gimmick: so long as the most recent behavior was regular enough to use a TZ rule, there was no need to generate any explicit transitions covered by that rule. But what about old clients, who used version 1? Would updates to zones become useless to them because they couldn't deal with version 2 yet? A comment in zic.c's `outzone()` function: /* ** For the benefit of older systems, ** generate data from 1900 through 2037. */ So that's why they still generate everything explicitly through 2037: the first piece of a version 2 (and version 3) tzfile _is_ a version 1 tzfile, and ancient software expecting version 1 can still use current version 3 tzfiles without problems. But "modern" software is expected to use the TZ rule - they're never going to generate explicit transitions beyond 2037 except when a TZ rule is inadequate to express them. They only generate them now for the benefit of legacy systems. Python is aging, but it's not _that_ old yet ;-) From rosuav at gmail.com Mon Aug 24 09:50:03 2015 From: rosuav at gmail.com (Chris Angelico) Date: Mon, 24 Aug 2015 17:50:03 +1000 Subject: [Datetime-SIG] Implementing tzinfo for all valid datetimes (was Re: PEP-431/495) In-Reply-To: References: Message-ID: On Mon, Aug 24, 2015 at 12:49 PM, Tim Peters wrote: > Does that really matter? I don't know. None of this matters much to > me ;-) But a scheme that's striving to be anally correct about > precise seconds for years a century ago in places nobody ever heard of > ;-) should really try to make a reasonable guess about years just a > few decades from now. 
tzdata can't accurately represent future timestamps anyway, as they can change with barely a week's notice (cf Egypt and DST, and North Korea moving half an hour away from Japan); the past, however, we should be able to be sure of. ChrisA From stuart at stuartbishop.net Mon Aug 24 15:56:11 2015 From: stuart at stuartbishop.net (Stuart Bishop) Date: Mon, 24 Aug 2015 20:56:11 +0700 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: On 22 August 2015 at 03:49, Tim Peters wrote: >> - I want a boolean added to datetime instances, even if I don't like >> the name, because I can then deprecate pytz and its confusing API and >> implementation. I'm happy to work on Python implementation and >> documentation. It will save me time and effort in the long run. > > Later you seem to say you'd prefer a 3-state flag instead, so not sure > you really mean "boolean" here. I write Python and SQL for a living. Booleans are 3 state to me ;) In this case, I'm not fussed if the datetime instance has a 2 state or 3 state flag. This is different to the various constructors which I think need a 3 state flag in their arguments. True, False, None. >> - Most of my thoughts got encoded in PEP-431. This would give us a >> datetime module that operates exactly the way it does today, > > No. While 431 was highly obscure on this point, it turned out that > Lennart was determined to change arithmetic behavior. That can't fly, > for backward compatibility, and because even "aware" datetimes were > intended to use a "naive time" model internally. > > Specifically, if you add timedelta(days=1) to a datetime today, you > get "same time tomorrow" (day goes up by 1, but hour, minute, second > and microsecond remain the same) in all cases. Even if a DST > transition (or base-offset change, or leap-second change) occurred. > That's now called "classic" arithmetic. The default behavior can't be > changed. 
>
> What you seem to have in mind (accounting for two of the three known
> reasons for why a local clock may jump: DST and base-offset changes,
> but not leap second changes) is now called "timeline" (sometimes
> "strict") arithmetic.

Grump. I always interpreted that documentation to mean that timezone conversions were *my* problem as the author of the tzinfo implementation. I thought it was a documented problem to be fixed if/when Python ever provided more complex tzinfo implementations, and one of the reasons it never did provide such implementations in the first place.

Classic behaviour as you describe it is a bug. It sounds ok when you state it as 'add one day to today and you get the same time tomorrow'. It does not sound ok when you state it as 'add one second to now and you will normally get now + 1 second, but sometimes you will get an instant further in the future, and sometimes you will get an instant in the past'.

I dispute that this is default behaviour that can't be changed. The different arithmetic only matters when you have a dst-aware aware datetime in play, and Python has never provided any apart perhaps from your original reference implementation (which stopped working in 2006). pytz, however, has always provided timeline arithmetic. I believe this is the most widely deployed way of obtaining dst-aware datetime instances, and this is the most widely expected behaviour. If you use pytz tzinfo instances, adding 1 second always adds one second and adding 1 day always adds 24 hours.

While calendaring style arithmetic is useful and a valid use case, it is useless if the only relative type is the day. You also need months and years and periodic things like 'first Sunday every month'. This is too complex to inflict its API on people by default. But pulling in dateutil's relative time helpers could be nice.

Do systems that rely on classic behavior actually exist?
It requires someone to have explicitly chosen to use daylight savings capable timezones, without using pytz, while at the same time relying on classic's surprising arithmetic. Maybe systems using dateutil without using dateutil's implementation of datetime arithmetic. I believe that there are many more systems out there that are broken by this behaviour than are relying on this behaviour. I think this is a bug worth fixing rather than entrenching, before adding any dst aware tzinfo implementations to stdlib (including 'local').

> According to Lennart, under PEP 431 timeline arithmetic would always
> be used. Under PEP 495, nothing about arithmetic changes. 495 is
> less ambitious, only intending to supply the bit(s) needed to _allow_
> timeline arithmetic to be implemented as an option later. PEP 500 is
> about supplying different arithmetics, but Guido hates PEP 500.

Ok. However... this also means the new flag on the datetime instances is largely irrelevant to pytz. pytz' API will need to remain the same. Adding a timedelta to a datetime will give you a datetime in exactly the same offset() and dst() as you started with (because pytz gives you timeline arithmetic, where adding 24 hours actually adds 24 hours), and you will need to fix it using the normalize method after the fact. The is_dst bit is effectively stored on the tzinfo instance currently in play, and having another copy on the datetime instance is unnecessary.

The new argument to the datetime constructors may be useful, if it accepts tri-state. If the is_dst/first flag accepts True, False or None, then pytz may be able to deprecate the localize method. If a user calls localize(is_dst=None), AmbiguousTimeError and NonExistentTimeError exceptions may be raised, but by default exceptions are not raised. I would also need the opportunity to swap in the correct fixed offset tzinfo instance for the given datetime.
(example below)

Losing the localize method will be a huge win for pytz, as it is ugly and causes great confusion and many identical bug reports. The other problem, the normalize method, is less important - if you neglect to call normalize you still get the correct instant, but it may be reported in the incorrect timezone period (EST instead of EDT or vice versa).

> In the end, I expect timezone wrappers will supply factory functions,
> either separate functions for "give me such-and-such a timezone using
> classic arithmetic" and "give me such-and-such a timezone using
> timeline arithmetic", or a single function specifying the desired
> timezone and an optional flag to specify the arithmetic desired.

pytz users need to be able to construct datetimes that get silently normalized if they are ambiguous or non-existent. Some pytz users need to have exceptions raised if they attempt to construct datetimes that are ambiguous or non-existent. This is what I consider strict vs loose. Ideally:

    >>> str(datetime(2004, 4, 4, 2, 0, 0, tzinfo=eastern))
    '2004-04-04 03:00:00-04:00'
    >>> str(datetime(2004, 4, 4, 2, 0, 0, tzinfo=eastern, first=None))
    Traceback: ...
    pytz.NonExistentTimeError()

I also need to continue to support timeline arithmetic. This requires me not having a single tzinfo instance, but swapping in the correct fixed offset tzinfo instance at the right time. Currently, this uses the awful localize and normalize methods.
Ideally, post-PEP:

    >>> eastern = pytz.timezone('US/Eastern')
    >>> dt = datetime(2004, 4, 3, 2, 0, 0, tzinfo=eastern)
    >>> dt2 = dt + timedelta(days=1)
    >>> eastern is dt.tzinfo
    False
    >>> dt.tzinfo is dt2.tzinfo
    False
    >>> str(dt)
    '2004-04-03 02:00:00-05:00'
    >>> str(dt2)
    '2004-04-04 03:00:00-04:00'

If I can do this, there is no reason that pytz could not also support 'classic' style, but I certainly wouldn't want to encourage its use as my rant above might indicate ;) If I write documentation, it may require some editing, localizing from en_AU to something a little more polite.

> It's possible that 495 should do more in this direction. For now, it
> specifies enough that someone who cares can easily write a function to
> distinguish among "ambiguous time (in a fold)", "invalid time" (in a
> gap), and "happy time" ;-) , and do whatever _they_ want (ignore some
> subset, raise an exception, print a warning, supply a default, prompt
> the user for more info, ...).

As long as this doesn't break pytz, as it sounds like pytz will still be needed. For pytz users, being able to write a function to tell if the data you were given is broken is a step backwards. When constructing a datetime instance with pytz, users have the choice of raising exceptions or having pytz normalize the input. They are never given broken data (by their definition), and there is no need to weed it out.

> As above, it's possible 495 should do more. But it's hard to know
> when to stop. For example, there are many ways of specifying a
> datetime, including, e.g., using .combine() to paste a date and time
> together. It's generally impossible to make a fold/gap determination
> on a time alone - that's only possible in combination with a date. So
> does .combine() also need to whine? It's simpler overall to leave it
> to those users who care to check when they do care.
I think all functions that can create datetime instances will need the new optional flag and the flag should be tri-state, defaulting to not whine. > 495 couldn't care less what causes folds and gaps - it's equally > applicable to all causes, and whether in isolation or combination. > What it _does_ assume is that a single bit suffices to resolve > ambiguities: that there is no case in which more than two UTC times > have the same spelling on a local clock. The goal of the PEP is to > supply that bit. The burden is on the tzinfo supplier to set and use > it correctly. The burden is also on the tzinfo supplier to supply a > .utcoffset() "that works" to convert a local time to UTC, to supply a > .dst() that returns whatever the tzinfo supplier thinks it should > return, and to supply a .fromutc() that sets the bit correctly. The important bit here for pytz is that tzinfo.fromutc() may return a datetime with a different tzinfo instance. Also, to drop pytz' localize method I need something like 'tzinfo.normalize(dt)', where I have the opportunity to replace the tzinfo the user provided with the one with the correct offset/dst info. >> - My argument in favour of 'is_dst' over 'first' is that this is what >> we have in the data we are trying to load. You commonly have > .> a timestamp with a timezone abbreviation and/or offset. This can >> easily be converted to an is_dst flag. > > You mean by using platform C library functions (albeit perhaps wrapped > by Python)? > >> To convert it to a 'first' flag, we need to first parse the datetime, > > I'm unclear on this. To get a datetime _at all_ the timestamp has to > be converted to calendar notation (year, month, ...). Which is what > I'm guessing "parse" means here. That much has to be done in any > case. My example is weak. 
I'm thinking about parsing a string like: 2004-10-31 01:15 EST-05:00 Even if you know this is US/Eastern and not Estonia, you still need to know that for dates in October EDT is first and EST is not first, and for dates in april EST is first and EDT is not first, and you need to include a wide enough fuzz factor that future changes to the DST rules won't break your parser. But I guess a general purpose parser that cares would construct instances 3 days before and a 3 days later and use whichever tzinfo had the correct offset. Or just use a fixed offset tzinfo. >> - I think datetime should consider 1 day == 24 hours and not have >> concepts like years or months, just like it does today. As others >> suggested, a separate module dealing with leap years and variable >> length days may be useful to some people, as would leapsecond support >> for astronomers and astrologers. But if the default implementation >> gives different results to all the other tools on your system, people >> will think the default is wrong. > > Not sure what you mean here without specific examples of what you have > in mind. But, as above, classic arithmetic will remain the default > regardless - it's a dozen years too late to change that, even if > everyone wanted to (and - surprise - everyone doesn't ;-) ). I despair at the bug reports, questions and general confusion that will occur if dst-aware tzinfo implementations are added to stdlib. At the moment, it is an obscure wart despite its age. It will become an in your face wart as soon as a tzlocal implementation is landed, and a wart people will be angry about because they won't realize it is there until their production system loses an hours worth of orders because their Python app spat out an hours worth of invalid timestamps right around Halloween sale time. But I'm drifting off into hyperbole. For amusement, here is how you can add an hour and end up exactly where you started. 
Be careful to do your conversions at the right time, or the dst transition might eat your data (this example performed by a professional stuntman and should not be attempted at home):

>>> from datetime import datetime, timedelta, timezone
>>> from pytz.reference import Eastern
>>> dt = datetime(2004, 4, 4, 1, 0, 0, tzinfo=Eastern)
>>> str(dt.astimezone(timezone.utc))
'2004-04-04 06:00:00+00:00'
>>> str((dt + timedelta(hours=1)).astimezone(timezone.utc))
'2004-04-04 06:00:00+00:00'

--
Stuart Bishop
http://www.stuartbishop.net/

From stuart at stuartbishop.net Mon Aug 24 16:24:27 2015
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Mon, 24 Aug 2015 21:24:27 +0700
Subject: [Datetime-SIG] Is EDT a timezone? Was: PEP-0500
In-Reply-To:
References:
Message-ID:

On 23 August 2015 at 09:05, Tim Peters wrote:
>> I wonder why Stuart needed timeline arithmetic? Merely being able to access
>> the Olson database doesn't sound enough of a reason for such heroism.
>
> dateutil also wraps the Olson database (among several other sources of
> timezone info), but made no effort to change Python's default
> haphazard treatment of folds and gaps.
>
> I'd be interested to hear Stuart's thoughts on this. Back when I was
> searching the web for clues, I mostly saw people (blogs, message
> threads) speculating about _conversion_ (not mostly arithmetic)
> surprises. Note that once 495-compliant tzinfos exist, zone
> conversions using them will work correctly in all cases by magic (no
> change to arithmetic is required - classic arithmetic is what's needed
> for those and we already have it - conversions today only trip over
> the current inability to disambiguate folds - as the docs say, for the
> default .fromutc() to mimic the local clock in all cases, .dst() must
> consider an ambiguous time to be "in standard time" - which is wrong
> "half the time").

My, that was a long time ago. I think interoperability is the biggest issue.
Timezones were a big hassle for Australians writing web applications, because most of the world thought EST meant American Eastern Standard Time while we all knew the correct answer was Australian Eastern Standard Time, or for the rest of the year Australian Eastern Summer Time. So we used timezone-aware timestamps to avoid these silly assumptions (and cursed zope.DateTime, which hardcoded EST to the USA). And tried to keep all our timestamps in UTC as best practice.

But you still had to deal with localized timestamps of course, and converting them to UTC and back to perform arithmetic was a pain in the bum and something lazy devs often didn't bother to do, and that is where the Python implementation fell down. It needed to drop the is_dst flag for pickle-size-related reasons IIRC, but without that flag you can't do datetime arithmetic that crosses DST boundaries correctly. It had a limitation, and it was documented, and we moved on because apart from this little wart it is a nice module to use, and fixing it would have made my ZODB2 database bloat.

And then I realized that I could store that single bit of information in the tzinfo instance, and went and implemented pytz because I was young and foolish. And then Gustavo released dateutil, which avoided the issue by avoiding the issue, and prompted me to get off my arse, fix the remaining tests and release it.

Conversion was the larger issue, and to do that correctly the arithmetic needed to be fixed. I never thought I was fixing the arithmetic or implementing a particular style - I was just getting my test suite to pass (which is generated from every transition stored in the IANA nee Olson database).
-- Stuart Bishop http://www.stuartbishop.net/ From tim.peters at gmail.com Mon Aug 24 17:04:18 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 24 Aug 2015 10:04:18 -0500 Subject: [Datetime-SIG] Implementing tzinfo for all valid datetimes (was Re: PEP-431/495) In-Reply-To: References: Message-ID: [Chris Angelico ] > tzdata can't accurately represent future timestamps anyway, as they > can change with barely a week's notice (cf Egypt and DST, and North > Korea moving half an hour away from Japan); Of course not. "Reasonable guess" means a guess based on all current knowledge. And that's what zoneinfo intends to supply with its TZ rules for instants beyond the explicit transitions. As time goes on, the TZ rules will need to be used for the present and the past too in _many_ cases, because they pretty obviously don't intend to generate any explicit transitions beyond 2037 ever again, except in cases where a TZ rule is inadequate. > the past, however, we should be able to be sure of. Oh no. Get the zoneinfo source and study it. They're very open about that there's nothing "sure" about the vast bulk of it (which is historical transitions). Note especially the "Errors in the tz database arise from many sources" section of the Theory file; e.g., just one of those: * Most of the pre-1970 data entries come from unreliable sources, often astrology books that lack citations and whose compilers evidently invented entries when the true facts were unknown, without reporting which entries were known and which were invented. These books often contradict each other or give implausible entries, and on the rare occasions when they are checked they are typically found to be incorrect. And they list 17 other significant sources of errors. For the vast bulk of historical times, the only real advantage of zoneinfo is to enable all clients to report the _same_ fictions. 
For recent and future times, zoneinfo has a crushing advantage: it's the only source of timezone info that's aggressively kept up-to-date by bona fide timezone wonks ;-) That's its real value - it could lose the historical stuff and be just as valuable to almost everyone. From stuart at stuartbishop.net Mon Aug 24 17:54:30 2015 From: stuart at stuartbishop.net (Stuart Bishop) Date: Mon, 24 Aug 2015 22:54:30 +0700 Subject: [Datetime-SIG] Implementing tzinfo for all valid datetimes (was Re: PEP-431/495) In-Reply-To: References: Message-ID: On 24 August 2015 at 14:05, Tim Peters wrote: > [Tim] >> Oops! Somewhere around 2037-2038 it apparently lost all knowledge of >> US/Eastern daylight time. I expect this is why: >> >> >>> ez._utc_transition_times[-1] >> datetime.datetime(2037, 11, 1, 6, 0) >> >> That is, the last transition it knows about is the end of daylight time in 2037. > ... >> >> Digging deeper, I don't think I can pin this on tzfile. The docs say >> that, if possible, a tzfile also contains a POSIX-TZ-style rule to be >> used for times beyond the last explicit transition instant. In the >> US/Eastern tzfile shipped with this version of pytz, that's: >> >> EST5EDT,M3.2.0,M11.1.0 >> >> So a "complete" wrapping of zoneinfo also requires implementing such >> rules when present. Yeah, I should really do that before 2037... > This appears to be the scoop, although I may be wrong about some: > when tzfile was first invented, like most other stuff at the time it > assumed the world would end before 2038 (the first year a signed > 32-bit int is too narrow to hold a UNIX(tm) seconds-since-1970 > timestamp). Values in a tzfile were all at most 4 bytes, zic > generated all transitions explicitly through the end of 2037, and that > was that. Not that the world would end. 
Just that we would be safely retired :)

--
Stuart Bishop
http://www.stuartbishop.net/

From alexander.belopolsky at gmail.com Mon Aug 24 19:16:15 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 24 Aug 2015 13:16:15 -0400
Subject: [Datetime-SIG] pytz vs. PEP 495 Was: PEP-431/495
Message-ID:

It is quite possible that we have a communication failure because I am not sufficiently familiar with the way pytz works. Let's use this new thread to discuss what is common and what is different between pytz and PEP 495 approaches to disambiguating folds and dealing with gaps. Let's keep any kind of arithmetic issues out of this thread and stay within the scope of PEP 495.

On Mon, Aug 24, 2015 at 9:56 AM, Stuart Bishop wrote:
> If a user calls localize(is_dst=None), AmbiguousTimeError and
> NonExistentTimeError exceptions may be raised, but by default
> exceptions are not raised.
>

From your recent example, naive datetime(2004, 4, 4, 2) falls in the US/Eastern spring-forward gap:

>>> dt = datetime(2004, 4, 4, 2)

Here is what I get using pytz.timezone:

>>> EasternTZ = pytz.timezone('US/Eastern')
>>> EasternTZ.localize(dt).isoformat()
'2004-04-04T02:00:00-05:00'
>>> EasternTZ.localize(dt, is_dst=-1).isoformat()
'2004-04-04T02:00:00-04:00'
>>> EasternTZ.localize(dt, is_dst=0).isoformat()
'2004-04-04T02:00:00-05:00'
>>> EasternTZ.localize(dt, is_dst=1).isoformat()
'2004-04-04T02:00:00-04:00'
>>> EasternTZ.localize(dt, is_dst=10).isoformat()
'2004-04-04T02:00:00-04:00'
>>> EasternTZ.localize(dt, is_dst=None).isoformat()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../pytz/tzinfo.py", line 327, in localize
    raise NonExistentTimeError(dt)
pytz.exceptions.NonExistentTimeError: 2004-04-04 02:00:00

Note that in all non-error cases, you get the invalid "02:00" time in the output.
PEP 495 takes a different approach:

>>> from test.datetimetester import Eastern2
>>> datetime(2004, 4, 4, 2, first=True, tzinfo=Eastern2).astimezone().isoformat()
'2004-04-04T03:00:00-04:00'
>>> datetime(2004, 4, 4, 2, first=False, tzinfo=Eastern2).astimezone().isoformat()
'2004-04-04T01:00:00-05:00'

A post-PEP 495 timezone conversion faced with a missing time is required to return a valid time. This is similar to the way C mktime works in most implementations. If you give it a struct tm representing a time from a DST gap, it will "normalize" it by changing tm_hour up or down depending on the tm_isdst value.

From guido at python.org Mon Aug 24 19:18:24 2015
From: guido at python.org (Guido van Rossum)
Date: Mon, 24 Aug 2015 10:18:24 -0700
Subject: [Datetime-SIG] PEP-431/495
In-Reply-To:
References:
Message-ID:

I'm still unclear on why pytz solves two problems (timeline arithmetic and a tzinfo database). What is the linkage between the two besides that you (Stuart) feel that both are important problems to solve?

--
--Guido van Rossum (python.org/~guido)

From carl at oddbird.net Mon Aug 24 19:44:07 2015
From: carl at oddbird.net (Carl Meyer)
Date: Mon, 24 Aug 2015 11:44:07 -0600
Subject: [Datetime-SIG] PEP-431/495
In-Reply-To:
References:
Message-ID: <55DB57E7.20302@oddbird.net>

On 08/24/2015 11:18 AM, Guido van Rossum wrote:
> I'm still unclear on why pytz solves two problems (timeline arithmetic
> and a tzinfo database). What is the linkage between the two besides that
> you (Stuart) feels that both are important problems to solve?

Didn't Stuart already answer this in his last email?

"Grump. I always interpreted that documentation to mean that timezone conversions were *my* problem as the author of the tzinfo implementation.
I thought it was a documented problem to be fixed if/when Python ever provided more complex tzinfo implementations, and one of the reasons it never did provide such implementations in the first place." "Classic behaviour as you describe it is a bug. It sounds ok when you state it as 'add one day to today and you get the same time tomorrow'. It does not sound ok when you state it as 'add one second to now and you will normally get now + 1 second, but sometimes you will get an instant further in the future, and sometimes you will get an instant in the past'." In other words, he just assumed that timeline arithmetic was his only reasonable option as the author of a useful, non-buggy tzinfo implementation. As a user of pytz, it was certainly what I expected, and I'd have considered "classic arithmetic" to be a bug (thanks to pytz, I avoided even knowing that was Python's default behavior until this thread), so I can't fault his assumption. I think the other linkage between the two is that pytz's "every tzinfo instance is fixed-offset" is the most natural way to solve the PEP-495 problem in the absence of PEP 495 and ensure that all datetime instances are unambiguous and valid. Faced with "I need to store this extra disambiguation bit in the tzinfo somehow, to clarify which of two offsets is intended when a timezone transitions from one offset to another", you can either store a boolean somewhere which is usually irrelevant and very hard to name sensibly, or you can "store" the flag by simply assigning a tzinfo instance which represents the specific offset you want (but also knows its full timezone rules, so it can "normalize" to a different offset when asked to). Once you choose the latter implementation (which you might reasonably choose even if you didn't care about arithmetic at all), Python gives you timeline arithmetic automatically. Carl -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From guido at python.org Mon Aug 24 20:02:49 2015 From: guido at python.org (Guido van Rossum) Date: Mon, 24 Aug 2015 11:02:49 -0700 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: <55DB57E7.20302@oddbird.net> References: <55DB57E7.20302@oddbird.net> Message-ID: On Mon, Aug 24, 2015 at 10:44 AM, Carl Meyer wrote: > On 08/24/2015 11:18 AM, Guido van Rossum wrote: > > I'm still unclear on why pytz solves two problems (timeline arithmetic > > and a tzinfo database). What is the linkage between the two besides that > > you (Stuart) feels that both are important problems to solve? > > Didn't Stuart already answer this in his last email? > > "Grump. I always interpreted that documentation to mean that timezone > conversions where *my* problem as the author of the tzinfo > implementation. I thought it was a documented problem to be fixed > if/when Python ever provided more complex tzinfo implementations, and > one of the reasons it never did provide such implementations in the > first place." > I guess he assumed wrong. We didn't consider the arithmetic we implemented broken at all. Our assumptions at the time were: - If you want timeline arithmetic you will obviously be using UTC (either POSIX timestamps or datetime instances with assumed or explicit UTC) - Implementing a tzinfo database is a huge project that we'd like to tackle at some point in the future (possibly as a 3rd party project) I'm still confused about what makes Stuart believe a tzinfo database must also change the arithmetic rules (especially since Gustavo's dateutils apparently gets along quite well without this). > "Classic behaviour as you describe it is a bug. It sounds ok when you > state it as 'add one day to today and you get the same time tomorrow'. 
> It does not sound ok when you state it as 'add one second to now and > you will normally get now + 1 second, but sometimes you will get an > instant further in the future, and sometimes you will get an instant > in the past'." > That sounds tautological to me -- "it's a bug because I find it buggy". Maybe the underlying reason is that to me, even a datetime with tzinfo does not refer to an instant -- it refers to something that's displayed on a clock. To me, arithmetic is a way of moving the hands of the clock. In other words, he just assumed that timeline arithmetic was his only > reasonable option as the author of a useful, non-buggy tzinfo > implementation. As a user of pytz, it was certainly what I expected, and > I'd have considered "classic arithmetic" to be a bug (thanks to pytz, I > avoided even knowing that was Python's default behavior until this > thread), so I can't fault his assumption. > But again that proves nothing. Of course if you're used to pytz's behavior you'll find the other behavior buggy. And it still does nothing to explain (to me) why the two need to be inextricably linked. > I think the other linkage between the two is that pytz's "every tzinfo > instance is fixed-offset" is the most natural way to solve the PEP-495 > problem in the absence of PEP 495 and ensure that all datetime instances > are unambiguous and valid. Again (as can be seen from the endless bickering between Alexander and myself about whether this is a bug or not) your view is colored by pytz's position. 
> Faced with "I need to store this extra > disambiguation bit in the tzinfo somehow, to clarify which of two > offsets is intended when a timezone transitions from one offset to > another", you can either store a boolean somewhere which is usually > irrelevant and very hard to name sensibly, or you can "store" the flag > by simply assigning a tzinfo instance which represents the specific > offset you want (but also knows its full timezone rules, so it can > "normalize" to a different offset when asked to). > But almost all instants (99.98%, according to Alexander) are not ambiguous and have no need for the disambiguation. If I every program an alarm to occur weekday at 7am I'd be most disturbed if an implementation botched the DST transition and woke me up at 6am one Monday morning. And yet that 7am is in a specific timezone (mine!). > Once you choose the latter implementation (which you might reasonably > choose even if you didn't care about arithmetic at all), Python gives > you timeline arithmetic automatically. You've proved nothing except that pytz's view is seductive. :-) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Mon Aug 24 20:03:44 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 24 Aug 2015 13:03:44 -0500 Subject: [Datetime-SIG] Conversion vs arithmetic (was Re: Is EDT a timezone? Was: PEP-0500) Message-ID: Sorry, I can only make time for a single point now - but it's an important one, and seemingly subtle ;-) [Stuart Bishop ] > ... > Conversion was the larger issue, and to do that correctly the > arithmetic needed to be fixed. Let's be really clear on this: conversion has nothing to do with arithmetic. Well, yes, it does: doing conversion correctly is easiest using _classic_ arithmetic, which we already have. By "conversion" I mean conversion: .astimezone() and .fromutc(). 
You gave an example in another message today you called "conversion" that actually mixed conversion with addition, and that's not what I mean by "conversion". By "conversion", I only mean the conversion part of that example ;-)

The problems with conversion have entirely to do with the fact that .fromutc() today is incapable of setting a bit to say which UTC value was intended when the result in the destination zone is ambiguous, which in turn means .utcoffset() in the destination zone has no way to know either. That, and only that, is what makes conversions fail in some cases today. When tzinfos implementing PEP 495 are available, all conversions using those tzinfos will be fixed "by magic".

Achieving that requires no change of any kind to arithmetic. To the contrary, this is all .astimezone(self, tz) does (in non-degenerate cases):

    myoffset = self.utcoffset()
    utc = (self - myoffset).replace(tzinfo=tz)  # convert to UTC and paste on tz
    return tz.fromutc(utc)

The subtraction (in the 2nd line) must use classic arithmetic; if it used timeline arithmetic instead, it could at best fall into an infinite regress (the line is _implementing_ conversion of `self` to UTC, but in timeline arithmetic subtraction would first try to convert `self` to UTC, which in turn ...). Lennart bumped into this in various guises, which is why his PEP's implementation stalled - implementing "timeline arithmetic always" doesn't _help_ conversion, it gets in the way - classic arithmetic is the rock that stops it from being "turtles all the way down" ;-)

No change to .astimezone() is needed either. Getting conversions right is entirely about:

1. The .utcoffset() in line 1 returning the correct result, which in turn is entirely about .utcoffset() knowing which value to return when `self` is in a fold. PEP 495 is enough to address that.

and

2. .fromutc() setting `fold` correctly in the final (`return tz.fromutc(utc)`) line. PEP 495 is enough to address that too.

Timeline arithmetic is a different issue.
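The three-line `.astimezone()` outline above can be run as a sketch. `ToyEastern` and `astimezone_sketch` are hypothetical names, the toy zone only knows the 2004 US/Eastern transitions, and `fold` is spelled as it was eventually released in Python 3.6. Under those assumptions, the subtraction is plain classic arithmetic, and the only fold-aware pieces are `.utcoffset()` (point 1) and `.fromutc()` (point 2).

```python
from datetime import datetime, timedelta, timezone, tzinfo

HOUR = timedelta(hours=1)
STD, DST = timedelta(hours=-5), timedelta(hours=-4)   # EST, EDT
DST_START_UTC = datetime(2004, 4, 4, 7)    # 02:00 EST jumps to 03:00 EDT
DST_END_UTC = datetime(2004, 10, 31, 6)    # 02:00 EDT falls back to 01:00 EST


def _offset_at(utc_naive):
    """Offset in force at a naive UTC instant (2004 rules only)."""
    return DST if DST_START_UTC <= utc_naive < DST_END_UTC else STD


class ToyEastern(tzinfo):
    """Hypothetical fold-aware zone; an assumption made for illustration."""

    def utcoffset(self, dt):
        naive = dt.replace(tzinfo=None)
        valid = [off for off in (DST, STD) if _offset_at(naive - off) == off]
        if len(valid) == 1:                 # ordinary time
            return valid[0]
        if valid:                           # fold: fold picks the reading
            return STD if dt.fold else DST
        return DST if dt.fold else STD      # gap

    def dst(self, dt):
        return self.utcoffset(dt) - STD

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"

    def fromutc(self, dt):
        # dt carries UTC clock values with tzinfo=self pasted on.
        utc = dt.replace(tzinfo=None)
        local = utc + _offset_at(utc)
        # The first hour after DST ends repeats wall-clock values: mark it.
        repeated = DST_END_UTC <= utc < DST_END_UTC + HOUR
        return local.replace(tzinfo=self, fold=int(repeated))


def astimezone_sketch(dt, tz):
    """The outline from the message, written as a free function."""
    myoffset = dt.utcoffset()                   # 1. must know about the fold
    utc = (dt - myoffset).replace(tzinfo=tz)    # classic subtraction, paste on tz
    return tz.fromutc(utc)                      # 2. must set fold


eastern = ToyEastern()
early = astimezone_sketch(datetime(2004, 10, 31, 5, 30, tzinfo=timezone.utc), eastern)
late = astimezone_sketch(datetime(2004, 10, 31, 6, 30, tzinfo=timezone.utc), eastern)
```

Both `early` and `late` read 01:30 on the wall clock; only `fold` (0 and 1) tells them apart, and that bit is what lets each one convert back to its own UTC instant.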
`fold` is necessary but not sufficient to get by-magic timeline arithmetic; `fold` is both necessary and sufficient to repair conversions.

If and when optional timeline arithmetic is implemented, _then_ the .astimezone() implementation will need to change, to _force_ use of classic arithmetic to convert to UTC. You should consider that to be an example of crucial code that indeed relies on classic arithmetic.

You haven't bumped into anything like that in pytz because pytz did not change arithmetic: to "get the effect" of timeline arithmetic, users have to explicitly invoke a distinct .normalize() method in pytz. Nothing (whether in Python, other user code, 3rd-party libraries ...) relying on classic arithmetic _could_ be affected by that, so of course you never saw any problems. Lennart fell into a bottomless pit of pain when he did try changing default arithmetic.

Which is why the default cannot be changed: Lennart already showed that changing it creates a bottomless pit of pain; indeed, it was so deep he never managed to climb out of it :-(

From alexander.belopolsky at gmail.com Mon Aug 24 20:07:33 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 24 Aug 2015 14:07:33 -0400
Subject: [Datetime-SIG] PEP-431/495
In-Reply-To:
References: <55DB57E7.20302@oddbird.net>
Message-ID:

On Mon, Aug 24, 2015 at 2:02 PM, Guido van Rossum wrote:

> To me, arithmetic is a way of moving the hands of the clock.

+1 - QOTW
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From alexander.belopolsky at gmail.com Mon Aug 24 20:20:58 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 24 Aug 2015 14:20:58 -0400 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: <55DB57E7.20302@oddbird.net> Message-ID: On Mon, Aug 24, 2015 at 2:02 PM, Guido van Rossum wrote: > > >> I think the other linkage between the two is that pytz's "every tzinfo >> instance is fixed-offset" is the most natural way to solve the PEP-495 >> problem in the absence of PEP 495 and ensure that all datetime instances >> are unambiguous and valid. > > > Again (as can be seen from the endless bickering between Alexander and > myself about whether this is a bug or not) your view is colored by pytz's > position. > I really regret that it came out as "bickering," because I am on Guido's side when it comes to a full DST aware tzinfo implementation. The fixed offset tzinfo implementation came as a compromise between those who did not want any concrete tzinfo implementations in the stdlib and those who wanted a full-featured LocalZone implementation. I still want to see LocalZone in stdlib, but to me it is only a worthwhile addition if it follows the original Guido/Tim design. If you want the aware instances that do timeline arithmetics - you already have two ways to do it: convert to UTC or convert to the "current" fixed offset timezone. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Mon Aug 24 20:29:57 2015 From: guido at python.org (Guido van Rossum) Date: Mon, 24 Aug 2015 11:29:57 -0700 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: <55DB57E7.20302@oddbird.net> Message-ID: On Mon, Aug 24, 2015 at 11:20 AM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > On Mon, Aug 24, 2015 at 2:02 PM, Guido van Rossum > wrote: > >> >> >>> I think the other linkage between the two is that pytz's "every tzinfo >>> instance is fixed-offset" is the most natural way to solve the PEP-495 >>> problem in the absence of PEP 495 and ensure that all datetime instances >>> are unambiguous and valid. >> >> >> Again (as can be seen from the endless bickering between Alexander and >> myself about whether this is a bug or not) your view is colored by pytz's >> position. >> > > I really regret that it came out as "bickering," because I am on Guido's > side when it comes to a full DST aware tzinfo implementation. The fixed > offset tzinfo implementation came as a compromise between those who did not > want any concrete tzinfo implementations in the stdlib and those who wanted > a full-featured LocalZone implementation. > > I still want to see LocalZone in stdlib, but to me it is only a > worthwhile addition if it follows the original Guido/Tim design. If you > want the aware instances that do timeline arithmetics - you already have > two ways to do it: convert to UTC or convert to the "current" fixed offset > timezone. > Excellent. This makes me a little less worried about the eventual outcome of this discussion. (And so does Tim's latest response to Stuart.) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From carl at oddbird.net Mon Aug 24 20:30:31 2015 From: carl at oddbird.net (Carl Meyer) Date: Mon, 24 Aug 2015 12:30:31 -0600 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: <55DB57E7.20302@oddbird.net> Message-ID: <55DB62C7.2030407@oddbird.net> On 08/24/2015 12:02 PM, Guido van Rossum wrote: > Maybe the underlying reason is that to me, even a datetime with tzinfo > does not refer to an instant -- it refers to something that's displayed > on a clock. To me, arithmetic is a way of moving the hands of the clock. Yes, I think that's the basis of the differing views here. Clocks don't show timezones, so if I just wanted to "move the hands of the clock," I'd use a naive datetime, which is the proper representation for "something that is displayed on the clock." A datetime with a tzinfo _does_ correspond to an instant (even if you don't want to think of it that way), so to some of us it is confusing if it occasionally behaves as if it does not. > In other words, he just assumed that timeline arithmetic was his only > reasonable option as the author of a useful, non-buggy tzinfo > implementation. As a user of pytz, it was certainly what I expected, and > I'd have considered "classic arithmetic" to be a bug (thanks to pytz, I > avoided even knowing that was Python's default behavior until this > thread), so I can't fault his assumption. > > > But again that proves nothing. Of course if you're used to pytz's > behavior you'll find the other behavior buggy. And it still does nothing > to explain (to me) why the two need to be inextricably linked. It was the behavior I expected when I _first picked up_ pytz, and if I hadn't gotten it from pytz, I'd have -- well, I'd have continued to do store datetimes internally in UTC and do arithmetic in UTC, which is what I do and recommend to others. But I'd have considered the possibility of naive arithmetic with aware datetimes as an unnecessary attractive nuisance and bug magnet. 
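The two readings of "add a day" argued over in this thread can be put side by side. This is a sketch only: `ToyEastern` and `add_timeline` are hypothetical names, the toy zone knows just the 2004 US/Eastern transitions, and it uses the `fold` attribute as later released in Python 3.6. Classic arithmetic moves the hands of the clock; doing the arithmetic in UTC counts elapsed time.

```python
from datetime import datetime, timedelta, timezone, tzinfo

HOUR = timedelta(hours=1)
STD, DST = timedelta(hours=-5), timedelta(hours=-4)   # EST, EDT
DST_START_UTC = datetime(2004, 4, 4, 7)    # 02:00 EST jumps to 03:00 EDT
DST_END_UTC = datetime(2004, 10, 31, 6)    # 02:00 EDT falls back to 01:00 EST


def _offset_at(utc_naive):
    """Offset in force at a naive UTC instant (2004 rules only)."""
    return DST if DST_START_UTC <= utc_naive < DST_END_UTC else STD


class ToyEastern(tzinfo):
    """Hypothetical fold-aware zone; an assumption made for illustration."""

    def utcoffset(self, dt):
        naive = dt.replace(tzinfo=None)
        valid = [off for off in (DST, STD) if _offset_at(naive - off) == off]
        if len(valid) == 1:                 # ordinary time
            return valid[0]
        if valid:                           # fold
            return STD if dt.fold else DST
        return DST if dt.fold else STD      # gap

    def dst(self, dt):
        return self.utcoffset(dt) - STD

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"

    def fromutc(self, dt):
        utc = dt.replace(tzinfo=None)
        repeated = DST_END_UTC <= utc < DST_END_UTC + HOUR
        return (utc + _offset_at(utc)).replace(tzinfo=self, fold=int(repeated))


def add_timeline(dt, delta):
    """Add elapsed time: do the arithmetic in UTC, then convert back."""
    return (dt.astimezone(timezone.utc) + delta).astimezone(dt.tzinfo)


eastern = ToyEastern()
start = datetime(2004, 4, 3, 12, tzinfo=eastern)        # noon EST, day before the gap
classic = start + timedelta(days=1)                      # same wall-clock time next day
timeline = add_timeline(start, timedelta(days=1))        # exactly 24 hours later

# How much time really elapsed under classic arithmetic.
elapsed_classic = classic.astimezone(timezone.utc) - start.astimezone(timezone.utc)
```

Here `classic` is noon on 2004-04-04 (only 23 hours after `start`), while `timeline` is 13:00 on the same day. Which of the two is "right" is the disagreement itself; the sketch only shows that both are reachable with classic arithmetic plus conversions.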
> I think the other linkage between the two is that pytz's "every tzinfo > instance is fixed-offset" is the most natural way to solve the PEP-495 > problem in the absence of PEP 495 and ensure that all datetime instances > are unambiguous and valid. > > > Again (as can be seen from the endless bickering between Alexander and > myself about whether this is a bug or not) your view is colored by > pytz's position. I think the argument about whether it's a bug has been unenlightening, because it of course depends on what you want (and how you see datetimes, as described above). I certainly think that pytz's need for `normalize()` calls is most unfortunate (and I know Stuart agrees with that). But it preserves the most important thing (from my perspective), which is that the resulting datetime always corresponds to the right instant, even if it's got the offset wrong until you normalize it. Given the limited options Stuart had, I think that was the best choice available for my use cases (and apparently his). > Faced with "I need to store this extra > disambiguation bit in the tzinfo somehow, to clarify which of two > offsets is intended when a timezone transitions from one offset to > another", you can either store a boolean somewhere which is usually > irrelevant and very hard to name sensibly, or you can "store" the flag > by simply assigning a tzinfo instance which represents the specific > offset you want (but also knows its full timezone rules, so it can > "normalize" to a different offset when asked to). > > > But almost all instants (99.98%, according to Alexander) are not > ambiguous and have no need for the disambiguation. Of course. But that can easily be an argument in favor of pytz's implementation choice. Why give every tzinfo a boolean flag that is worse-than-useless (because its presence and naming is confusing) in the 99.98% of cases when it's not needed, when you can instead give every tzinfo an unambiguous offset and eliminate the problem? 
If I every program an > alarm to occur weekday at 7am I'd be most disturbed if an implementation > botched the DST transition and woke me up at 6am one Monday morning. And > yet that 7am is in a specific timezone (mine!). This example is meaningless, because "program an alarm to occur every weekday at 7am" is not a valid use case for any type of datetime duration arithmetic at all. It's a use case for a period recurrence, which is an entirely different beast (e.g. dateutil.rrule). It's the same type of operation as "notify me every second Saturday," not as "what will the time be in 15 seconds". Programming that use case using "add 86400 seconds to the time my alarm went off yesterday" is certainly a possible newbie mistake someone might make who hasn't worked with timezones or DST before (and who also forgot the "weekdays only" requirement), but it's a mistake that should be fixed by pointing them to a recurrence library, not by having aware datetimes use naive arithmetic. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From tim.peters at gmail.com Mon Aug 24 20:36:28 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 24 Aug 2015 13:36:28 -0500 Subject: [Datetime-SIG] Conversion vs arithmetic (was Re: Is EDT a timezone? Was: PEP-0500) In-Reply-To: References: Message-ID: [ISAAC J SCHWABACHER ] > People keep talking about infinite regress, but I don't think > there's a problem there at all because in the internals of the > time zone conversion, you can strip off the time zone before > doing arithmetic. Am I missing something here? Yes: magic doesn't exist in the real world ;-) All places relying on classic arithmetic _could_ be repaired if timeline arithmetic were made mandatory, but finding them would require an exhaustive expert audit of all instances of aware-datetime arithmetic in all the world's code. 
It only requires finding one such instance to prove that code _would_ break if the default were changed. I didn't look for one - I just happened to notice one in .astimezone(). I have no idea how many such places Lennart found across the 2+ years PEP 431 was struggling to finish (but I also have no idea about the _details_ of the problems Lennart hit). From guido at python.org Mon Aug 24 20:50:16 2015 From: guido at python.org (Guido van Rossum) Date: Mon, 24 Aug 2015 11:50:16 -0700 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: <55DB62C7.2030407@oddbird.net> References: <55DB57E7.20302@oddbird.net> <55DB62C7.2030407@oddbird.net> Message-ID: Let's just agree to disagree. I don't want to argue any more. On Mon, Aug 24, 2015 at 11:30 AM, Carl Meyer wrote: > On 08/24/2015 12:02 PM, Guido van Rossum wrote: > > Maybe the underlying reason is that to me, even a datetime with tzinfo > > does not refer to an instant -- it refers to something that's displayed > > on a clock. To me, arithmetic is a way of moving the hands of the clock. > > Yes, I think that's the basis of the differing views here. > > Clocks don't show timezones, so if I just wanted to "move the hands of > the clock," I'd use a naive datetime, which is the proper representation > for "something that is displayed on the clock." > > A datetime with a tzinfo _does_ correspond to an instant (even if you > don't want to think of it that way), so to some of us it is confusing if > it occasionally behaves as if it does not. > > > In other words, he just assumed that timeline arithmetic was his only > > reasonable option as the author of a useful, non-buggy tzinfo > > implementation. As a user of pytz, it was certainly what I expected, > and > > I'd have considered "classic arithmetic" to be a bug (thanks to > pytz, I > > avoided even knowing that was Python's default behavior until this > > thread), so I can't fault his assumption. > > > > > > But again that proves nothing. 
Of course if you're used to pytz's > > behavior you'll find the other behavior buggy. And it still does nothing > > to explain (to me) why the two need to be inextricably linked. > > It was the behavior I expected when I _first picked up_ pytz, and if I > hadn't gotten it from pytz, I'd have -- well, I'd have continued to do > store datetimes internally in UTC and do arithmetic in UTC, which is > what I do and recommend to others. But I'd have considered the > possibility of naive arithmetic with aware datetimes as an unnecessary > attractive nuisance and bug magnet. > > > I think the other linkage between the two is that pytz's "every > tzinfo > > instance is fixed-offset" is the most natural way to solve the > PEP-495 > > problem in the absence of PEP 495 and ensure that all datetime > instances > > are unambiguous and valid. > > > > > > Again (as can be seen from the endless bickering between Alexander and > > myself about whether this is a bug or not) your view is colored by > > pytz's position. > > I think the argument about whether it's a bug has been unenlightening, > because it of course depends on what you want (and how you see > datetimes, as described above). I certainly think that pytz's need for > `normalize()` calls is most unfortunate (and I know Stuart agrees with > that). But it preserves the most important thing (from my perspective), > which is that the resulting datetime always corresponds to the right > instant, even if it's got the offset wrong until you normalize it. Given > the limited options Stuart had, I think that was the best choice > available for my use cases (and apparently his). 
> >
> > Faced with "I need to store this extra disambiguation bit in the
> > tzinfo somehow, to clarify which of two offsets is intended when a
> > timezone transitions from one offset to another", you can either
> > store a boolean somewhere which is usually irrelevant and very hard
> > to name sensibly, or you can "store" the flag by simply assigning a
> > tzinfo instance which represents the specific offset you want (but
> > also knows its full timezone rules, so it can "normalize" to a
> > different offset when asked to).
> >
> >
> > But almost all instants (99.98%, according to Alexander) are not
> > ambiguous and have no need for the disambiguation.
>
> Of course. But that can easily be an argument in favor of pytz's
> implementation choice. Why give every tzinfo a boolean flag that is
> worse-than-useless (because its presence and naming is confusing) in the
> 99.98% of cases when it's not needed, when you can instead give every
> tzinfo an unambiguous offset and eliminate the problem?
>
> > If I ever program an alarm to occur every weekday at 7am I'd be most
> > disturbed if an implementation botched the DST transition and woke me
> > up at 6am one Monday morning. And yet that 7am is in a specific
> > timezone (mine!).
>
> This example is meaningless, because "program an alarm to occur every
> weekday at 7am" is not a valid use case for any type of datetime
> duration arithmetic at all. It's a use case for a period recurrence,
> which is an entirely different beast (e.g. dateutil.rrule). It's the
> same type of operation as "notify me every second Saturday," not as
> "what will the time be in 15 seconds".
>
> Programming that use case using "add 86400 seconds to the time my alarm
> went off yesterday" is certainly a possible newbie mistake someone might
> make who hasn't worked with timezones or DST before (and who also forgot
> the "weekdays only" requirement), but it's a mistake that should be
> fixed by pointing them to a recurrence library, not by having aware
> datetimes use naive arithmetic.
>
> Carl
>
>

--
--Guido van Rossum (python.org/~guido)

From ethan at stoneleaf.us Mon Aug 24 21:05:37 2015
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 24 Aug 2015 12:05:37 -0700
Subject: [Datetime-SIG] pytz vs. PEP 495 Was: PEP-431/495
In-Reply-To:
References:
Message-ID: <55DB6B01.8050606@stoneleaf.us>

On 08/24/2015 10:16 AM, Alexander Belopolsky wrote:

> PEP 495 takes a different approach:
>
>   >>> from test.datetimetester import Eastern2
>   >>> datetime(2004, 4, 4, 2, first=True, tzinfo=Eastern2).astimezone().isoformat()
>   '2004-04-04T03:00:00-04:00'
>   >>> datetime(2004, 4, 4, 2, first=False, tzinfo=Eastern2).astimezone().isoformat()
>   '2004-04-04T01:00:00-05:00'
>
> A post-PEP 495 timezone conversion faced with a missing time is required to return
> a valid time. This is similar to the way C mktime works in most implementations.
> If you give it a struct tm representing a time from a DST gap - it will "normalize"
> it by changing tm_hour up or down depending on the tm_isdst value.

I would be much happier about this if:

   >>> datetime(2004, 4, 4, 2, first=None, tzinfo=Eastern2).astimezone().isoformat()
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File ".../pytz/tzinfo.py", line 327, in localize
       raise NonExistentTimeError(dt)
   NonExistentTimeError: 2004-04-04 02:00:00

Giving the programmer an easier option to use if they want an exception.
--
~Ethan~

From alexander.belopolsky at gmail.com Mon Aug 24 21:14:40 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 24 Aug 2015 15:14:40 -0400
Subject: [Datetime-SIG] pytz vs. PEP 495 Was: PEP-431/495
In-Reply-To: <55DB6B01.8050606@stoneleaf.us>
References: <55DB6B01.8050606@stoneleaf.us>
Message-ID:

On Mon, Aug 24, 2015 at 3:05 PM, Ethan Furman wrote:

> I would be much happier about this if:
>
>   >>> datetime(2004, 4, 4, 2, first=None, tzinfo=Eastern2).astimezone().isoformat()
>   Traceback (most recent call last):
>     File "<stdin>", line 1, in <module>
>     File ".../pytz/tzinfo.py", line 327, in localize
>       raise NonExistentTimeError(dt)
>   NonExistentTimeError: 2004-04-04 02:00:00
>
> Giving the programmer an easier option to use if they want an exception.
>

Which of the steps do you want to raise an exception:

>>> dt = datetime(2004, 4, 4, 2, first=None, tzinfo=Eastern2)

>>> ldt = dt.astimezone()

or

>>> ldt.isoformat()

and why?

The stack trace that you presented comes from "localize", but no such method
is proposed in PEP 495.

From ischwabacher at wisc.edu Mon Aug 24 20:24:16 2015
From: ischwabacher at wisc.edu (ISAAC J SCHWABACHER)
Date: Mon, 24 Aug 2015 18:24:16 +0000
Subject: [Datetime-SIG] Conversion vs arithmetic (was Re: Is EDT a timezone? Was: PEP-0500)
In-Reply-To:
References:
Message-ID:

People keep talking about infinite regress, but I don't think there's a
problem there at all because in the internals of the time zone conversion,
you can strip off the time zone before doing arithmetic. Am I missing
something here?

________________________________________
From: Datetime-SIG on behalf of Tim Peters
Sent: Monday, August 24, 2015 13:03
To: Stuart Bishop
Cc: datetime-sig
Subject: [Datetime-SIG] Conversion vs arithmetic (was Re: Is EDT a timezone?
Was: PEP-0500)

Sorry, I can only make time for a single point now - but it's an
important one, and seemingly subtle ;-)

[Stuart Bishop ]
> ...
> Conversion was the larger issue, and to do that correctly the
> arithmetic needed to be fixed.

Let's be really clear on this: conversion has nothing to do with
arithmetic. Well, yes, it does: doing conversion correctly is easiest
using _classic_ arithmetic, which we already have.

By "conversion" I mean conversion: .astimezone() and .fromutc(). You
gave an example in another message today you called "conversion" that
actually mixed conversion with addition, and that's not what I mean by
"conversion". By "conversion", I only mean the conversion part of
that example ;-)

The problems with conversion have entirely to do with the fact that
.fromutc() today is incapable of setting a bit to say which UTC value
was intended when the result in the destination zone is ambiguous,
which in turn means .utcoffset() in the destination zone has no way to
know either. That, and only that, is what makes conversions fail in
some cases today.

When tzinfos implementing PEP 495 are available, all conversions using
those tzinfos will be fixed "by magic". Achieving that requires no
change of any kind to arithmetic. To the contrary, this is all
.astimezone(self, tz) does (in non-degenerate cases):

    myoffset = self.utcoffset()
    utc = (self - myoffset).replace(tzinfo=tz) # convert to UTC and paste on tz
    return tz.fromutc(utc)

The subtraction (in the 2nd line) must use classic arithmetic; if it
used timeline arithmetic instead, it could at best fall into an
infinite regress (the line is _implementing_ conversion of `self` to
UTC, but in timeline arithmetic subtraction would first try to convert
`self` to UTC, which in turn ...
Lennart bumped into this in various guises, which is why his PEP's
implementation stalled - implementing "timeline arithmetic always"
doesn't _help_ conversion, it gets in the way - classic arithmetic is
the rock that stops it from being "turtles all the way down" ;-) ).

No change to .astimezone() is needed either. Getting conversions
right is entirely about:

1. The .utcoffset() in line 1 returning the correct result, which in
   turn is entirely about .utcoffset() knowing which value to return
   when `self` is in a fold. PEP 495 is enough to address that.

and

2. .fromutc() setting `fold` correctly in the final (`return
   tz.fromutc(utc)`) line. PEP 495 is enough to address that too.

Timeline arithmetic is a different issue. `fold` is necessary but not
sufficient to get by-magic timeline arithmetic; `fold` is both
necessary and sufficient to repair conversions.

If and when optional timeline arithmetic is implemented, _then_ the
.astimezone() implementation will need to change, to _force_ use of
classic arithmetic to convert to UTC. You should consider that to be
an example of crucial code that indeed relies on classic arithmetic.

You haven't bumped into anything like that in pytz because pytz did
not change arithmetic: to "get the effect" of timeline arithmetic,
users have to explicitly invoke a distinct .normalize() method in
pytz. Nothing (whether in Python, other user code, 3rd-party
libraries ...) relying on classic arithmetic _could_ be affected by
that, so of course you never saw any problems. Lennart fell into a
bottomless pit of pain when he did try changing default arithmetic.
Which is why the default cannot be changed: Lennart already showed
that changing it creates a bottomless pit of pain; indeed, it was so
deep he never managed to climb out of it :-(
_______________________________________________
Datetime-SIG mailing list
Datetime-SIG at python.org
https://mail.python.org/mailman/listinfo/datetime-sig
The PSF Code of Conduct applies to this mailing list: https://www.python.org/psf/codeofconduct/

From ischwabacher at wisc.edu Mon Aug 24 21:30:31 2015
From: ischwabacher at wisc.edu (ISAAC J SCHWABACHER)
Date: Mon, 24 Aug 2015 19:30:31 +0000
Subject: [Datetime-SIG] Conversion vs arithmetic (was Re: Is EDT a timezone? Was: PEP-0500)
In-Reply-To:
References:
Message-ID:

[ijs]
> > People keep talking about infinite regress, but I don't think
> > there's a problem there at all because in the internals of the
> > time zone conversion, you can strip off the time zone before
> > doing arithmetic. Am I missing something here?

[Tim Peters]
> Yes: magic doesn't exist in the real world ;-) All places relying on
> classic arithmetic _could_ be repaired if timeline arithmetic were
> made mandatory, but finding them would require an exhaustive expert
> audit of all instances of aware-datetime arithmetic in all the world's
> code. It only requires finding one such instance to prove that code
> _would_ break if the default were changed. I didn't look for one - I
> just happened to notice one in .astimezone(). I have no idea how
> many such places Lennart found across the 2+ years PEP 431 was
> struggling to finish (but I also have no idea about the _details_ of
> the problems Lennart hit).

But here you're talking about the backward compatibility problem, not the
infinite regress problem. I guess I'll have to try to implement it and find
out where the headaches are myself.
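
For illustration only (this is not part of anyone's message), the three conversion steps quoted earlier in this thread can be tried out with the stdlib's fixed-offset `datetime.timezone`. The function name `astimezone_sketch` and the two example zones are assumptions made for this sketch; it is not CPython's actual implementation, and fixed offsets never hit the fold/gap cases under discussion.

```python
from datetime import datetime, timedelta, timezone


def astimezone_sketch(dt, tz):
    """Convert aware `dt` to `tz` using the three steps quoted in the thread."""
    my_offset = dt.utcoffset()
    # Classic arithmetic: subtracting a timedelta only shifts the wall-clock
    # fields and keeps dt's tzinfo; it never calls utcoffset(), so nothing
    # here can recurse back into conversion.
    utc = (dt - my_offset).replace(tzinfo=tz)  # UTC fields with tz pasted on
    return tz.fromutc(utc)


# Hypothetical fixed-offset zones, chosen only for the example.
utc_minus_5 = timezone(timedelta(hours=-5))
utc_plus_1 = timezone(timedelta(hours=1))

sent = datetime(2015, 8, 24, 13, 36, 28, tzinfo=utc_minus_5)
converted = astimezone_sketch(sent, utc_plus_1)
print(converted.isoformat())  # 2015-08-24T19:36:28+01:00
```

The result agrees with `sent.astimezone(utc_plus_1)`. If the subtraction on the second step were itself defined in terms of "convert to UTC first", the function would have to call itself to finish, which is the regress being argued about.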
It's also worth noting that backward compatibility cuts both ways: because Stuart Bishop has chosen to implement timeline arithmetic, any stdlib inclusion of pytz will have to support timeline arithmetic in some form in order to maintain backward compatibility with the package itself. I see how PEP 495 makes it possible to convert datetimes correctly in all cases, but I don't see how it makes it possible to implement time zones that will be pytz-compatible without continuing to require the localize/normalize dance. ijs From alexander.belopolsky at gmail.com Mon Aug 24 21:45:56 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 24 Aug 2015 15:45:56 -0400 Subject: [Datetime-SIG] Conversion vs arithmetic (was Re: Is EDT a timezone? Was: PEP-0500) In-Reply-To: References: Message-ID: On Mon, Aug 24, 2015 at 3:30 PM, ISAAC J SCHWABACHER wrote: > It's also worth noting that backward compatibility cuts both ways: because > Stuart Bishop has chosen to implement timeline arithmetic, any stdlib > inclusion of pytz will have to support timeline arithmetic in some form in > order to maintain backward compatibility with the package itself. I see how > PEP 495 makes it possible to convert datetimes correctly in all cases, but > I don't see how it makes it possible to implement time zones that will be > pytz-compatible without continuing to require the localize/normalize dance. The "localize/normalize dance" is part of the backward compatibility guarantee that PEP 495 makes to both datetime and (released) pytz users. The code that currently works correctly with "localize/normalize dance" cannot be made to work correctly without it because that will necessarily break some other code that currently works correctly without "localize/normalize dance." 
Stuart has an option of breaking with backward compatibility and releasing a
post-PEP 495 pytz2, but we cannot release Python 3.6 that will break every
program that uses current versions of pytz or dateutil no matter how
strongly we believe that one of the packages is more "correct" than another.

From ethan at stoneleaf.us Mon Aug 24 21:54:03 2015
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 24 Aug 2015 12:54:03 -0700
Subject: [Datetime-SIG] pytz vs. PEP 495 Was: PEP-431/495
In-Reply-To:
References: <55DB6B01.8050606@stoneleaf.us>
Message-ID: <55DB765B.8020702@stoneleaf.us>

On 08/24/2015 12:14 PM, Alexander Belopolsky wrote:
> On Mon, Aug 24, 2015 at 3:05 PM, Ethan Furman wrote:

>> I would be much happier about this if:
>>
>>   >>> datetime(2004, 4, 4, 2, first=None, tzinfo=Eastern2).astimezone().isoformat()
>>   Traceback (most recent call last):
>>     File "<stdin>", line 1, in <module>
>>     File ".../pytz/tzinfo.py", line 327, in localize
>>       raise NonExistentTimeError(dt)
>>   NonExistentTimeError: 2004-04-04 02:00:00
>>
>> Giving the programmer an easier option to use if they want an exception.
>
> Which of the steps do you want to raise an exception:
>
>   >>> dt = datetime(2004, 4, 4, 2, first=None, tzinfo=Eastern2)

This one.

> and why?

Because Python shouldn't be guessing. 'datetime' doesn't know how the
programmer got those numbers, and so can't know whether subtraction was used
and 1am is the right answer, or addition was used and 3am is the right answer.

> The stack trace that you presented comes from "localize", but no such method
> is proposed in PEP 495.

Sorry, I copied the stack trace with only slight modification -- the
important part is the NonExistentTimeError (or AmbiguousTimeError for the
opposite scenario).
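
For illustration only (not part of the original message), the kind of strict check being asked for can be written in a few lines on top of the flag as it eventually shipped in Python 3.6, where it is called `fold` (`fold=0` plays the role of `first=True` in this thread). Everything named below is an assumption made for the sketch: `ToyEastern2004` is a hypothetical tzinfo that only knows the 2004 US Eastern transitions, and the two exception classes are local stand-ins that merely borrow the names used above.

```python
from datetime import datetime, timedelta, tzinfo

ZERO = timedelta(0)
HOUR = timedelta(hours=1)


class AmbiguousTimeError(ValueError):
    """Local stand-in: the wall-clock time occurs twice."""


class NonExistentTimeError(ValueError):
    """Local stand-in: the wall-clock time is skipped."""


class ToyEastern2004(tzinfo):
    """Hypothetical US Eastern rules, valid for 2004 only."""

    GAP_START = datetime(2004, 4, 4, 2)     # 02:00-02:59 is skipped
    FOLD_START = datetime(2004, 10, 31, 1)  # 01:00-01:59 happens twice

    def dst(self, dt):
        wall = dt.replace(tzinfo=None)
        if self.GAP_START <= wall < self.GAP_START + HOUR:
            # In a gap, fold=0 uses the rules before the transition.
            return HOUR if dt.fold else ZERO
        if self.FOLD_START <= wall < self.FOLD_START + HOUR:
            # In a fold, fold=0 is the first (still-DST) occurrence.
            return ZERO if dt.fold else HOUR
        if self.GAP_START + HOUR <= wall < self.FOLD_START:
            return HOUR
        return ZERO

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


def strict_utcoffset(dt):
    """Return dt's UTC offset, raising if dt is missing or ambiguous."""
    first = dt.replace(fold=0).utcoffset()
    second = dt.replace(fold=1).utcoffset()
    if first < second:
        raise NonExistentTimeError(dt)
    if first > second:
        raise AmbiguousTimeError(dt)
    return first


tz = ToyEastern2004()
for wall in (datetime(2004, 4, 4, 2), datetime(2004, 10, 31, 1, 30),
             datetime(2004, 7, 1, 12)):
    try:
        print(wall, strict_utcoffset(wall.replace(tzinfo=tz)))
    except (AmbiguousTimeError, NonExistentTimeError) as exc:
        print(wall, type(exc).__name__)
```

The check compares the offsets reported for both values of `fold`: equal offsets mean an ordinary time, a smaller `fold=0` offset means the wall time fell in a gap, and a larger one means it fell in a fold. Whether such a check belongs in the library or in user code is exactly what the thread is debating.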
-- ~Ethan~ From tim.peters at gmail.com Mon Aug 24 21:54:09 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 24 Aug 2015 14:54:09 -0500 Subject: [Datetime-SIG] Conversion vs arithmetic (was Re: Is EDT a timezone? Was: PEP-0500) In-Reply-To: References: Message-ID: [ijs] > But here you're talking about the backward compatibility > problem, not the infinite regress problem. If a change causes an infinite regress in formerly working code, then it's a "backward compatibility" problem too. If you write code from the start with the change in mind, of course there's no _inherent_ "infinite regress problem". > I guess I'll have to try to implement it and find out > where the headaches are myself. Have fun ;-) > It's also worth noting that backward compatibility cuts both > ways: because Stuart Bishop has chosen to implement > timeline arithmetic, any stdlib inclusion of pytz will have to > support timeline arithmetic in some form in order to > maintain backward compatibility with the package itself. We _could_ include pytz exactly the way it is today, without changing anything (else) in Python or pytz. There are lots of ways to deal with this. We could also say, more like dateutil did, that Python finds the Olson database useful but wants _nothing_ to do with timeline arithmetic, not ever. Etc. In any case, there is no active PEP at the moment proposing to add any new timezone implementations to Python. One step at a time. > I see how PEP 495 makes it possible to convert datetimes correctly > in all cases, Whew! You may well be only the second person to grasp this - thanks :-) > but I don't see how it makes it possible to implement > time zones that will be pytz-compatible without continuing to require > the localize/normalize dance. Indeed it doesn't - nor does it preclude it. 
PEP 495 says nothing whatsoever about timeline arithmetic, or about any
other feature of any external timezone library except to the extent it
adds new requirements on their Python-specified tzinfo methods (chiefly
.utcoffset() and .fromutc()).

If pytz plays along with 495, then there's a new baseline to start
from. For example, the current (I think) pytz docs say:

"""
Converting between timezones also needs special attention. We also need
to use the ``normalize()`` method to ensure the conversion is correct.
"""

That dance "should" no longer be necessary. But I don't know -
perhaps pytz will continue to need it anyway. That's up to Stuart,
not to me.

From alexander.belopolsky at gmail.com Mon Aug 24 22:23:15 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 24 Aug 2015 16:23:15 -0400
Subject: [Datetime-SIG] pytz vs. PEP 495 Was: PEP-431/495
In-Reply-To: <55DB765B.8020702@stoneleaf.us>
References: <55DB6B01.8050606@stoneleaf.us> <55DB765B.8020702@stoneleaf.us>
Message-ID:

On Mon, Aug 24, 2015 at 3:54 PM, Ethan Furman wrote:

>> Which of the steps do you want to raise an exception:
>>
>>   >>> dt = datetime(2004, 4, 4, 2, first=None, tzinfo=Eastern2)
>>
>
> This one.

and what about this one?

>>> dt = time(2004, 4, 4, 2, first=None, tzinfo=Eastern2)

From alexander.belopolsky at gmail.com Mon Aug 24 22:24:26 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 24 Aug 2015 16:24:26 -0400
Subject: [Datetime-SIG] pytz vs. PEP 495 Was: PEP-431/495
In-Reply-To:
References: <55DB6B01.8050606@stoneleaf.us> <55DB765B.8020702@stoneleaf.us>
Message-ID:

On Mon, Aug 24, 2015 at 4:23 PM, Alexander Belopolsky <
alexander.belopolsky at gmail.com> wrote:

>
> On Mon, Aug 24, 2015 at 3:54 PM, Ethan Furman wrote:
>
>>> Which of the steps do you want to raise an exception:
>>>
>>>   >>> dt = datetime(2004, 4, 4, 2, first=None, tzinfo=Eastern2)
>>>
>>
>> This one.
> > > and what about this one? > > >>> dt = time(2004, 4, 4, 2, first=None, tzinfo=Eastern2) > Sorry, did not delete enough: >>> dt = time(2, first=None, tzinfo=Eastern2) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Mon Aug 24 22:44:25 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 24 Aug 2015 16:44:25 -0400 Subject: [Datetime-SIG] pytz vs. PEP 495 Was: PEP-431/495 In-Reply-To: References: <55DB6B01.8050606@stoneleaf.us> <55DB765B.8020702@stoneleaf.us> Message-ID: On Mon, Aug 24, 2015 at 4:28 PM, ISAAC J SCHWABACHER wrote: > [ijs] > I *really* hope the answer to this one is, "don't do that". > That's not an option because people already *do* [1] that and they won't stop. Neither they will stop using datetime.combine() [2] or datetime.replace() [3] or tolerate if those methods start raising exceptions. I am giving examples from datetime.py which in theory we can change, but what we use in datetime.py is likely to be used in user code as well. [1]: https://hg.python.org/cpython/file/v3.5.0rc1/Lib/datetime.py#l1476 [2]: https://hg.python.org/cpython/file/v3.5.0rc1/Lib/datetime.py#l1734 [3]: https://hg.python.org/cpython/file/v3.5.0rc1/Lib/datetime.py#l1540 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Mon Aug 24 22:55:31 2015 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 24 Aug 2015 13:55:31 -0700 Subject: [Datetime-SIG] pytz vs. PEP 495 Was: PEP-431/495 In-Reply-To: References: <55DB6B01.8050606@stoneleaf.us> <55DB765B.8020702@stoneleaf.us> Message-ID: <55DB84C3.3010904@stoneleaf.us> On 08/24/2015 01:44 PM, Alexander Belopolsky wrote: > On Mon, Aug 24, 2015 at 4:28 PM, ISAAC J SCHWABACHER wrote: > >> [ijs] >> I *really* hope the answer to this one is, "don't do that". > > That's not an option because people already *do* [1] that and they won't stop. 
> Neither they will stop using datetime.combine() [2] or datetime.replace() [3] > or tolerate if those methods start raising exceptions. If the default is True (or False), then this won't be a problem. It will only be None when explicitly asked for. `time` can just store the flag, and when it is combined with a date the flag should be checked and if None and the resulting datetime doesn't exist or is ambiguous an exception can be raised. -- ~Ethan~ From alexander.belopolsky at gmail.com Mon Aug 24 23:10:43 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 24 Aug 2015 17:10:43 -0400 Subject: [Datetime-SIG] pytz vs. PEP 495 Was: PEP-431/495 In-Reply-To: <55DB84C3.3010904@stoneleaf.us> References: <55DB6B01.8050606@stoneleaf.us> <55DB765B.8020702@stoneleaf.us> <55DB84C3.3010904@stoneleaf.us> Message-ID: On Mon, Aug 24, 2015 at 4:55 PM, Ethan Furman wrote: > On 08/24/2015 01:44 PM, Alexander Belopolsky wrote: > >> On Mon, Aug 24, 2015 at 4:28 PM, ISAAC J SCHWABACHER wrote: >> >> [ijs] >>> I *really* hope the answer to this one is, "don't do that". >>> >> >> That's not an option because people already *do* [1] that and they won't >> stop. >> Neither they will stop using datetime.combine() [2] or datetime.replace() >> [3] >> or tolerate if those methods start raising exceptions. >> > > If the default is True (or False), then this won't be a problem. It will > only be None when explicitly asked for. > I addressed [1] three reasons why people want to have the third value for the fold index in the recent version of the PEP. Let me just note that the three reasons are mutually exclusive: for example, the first and last call for different defaults. I suggest that the proponents of the fold=None/fold=-1 option first agree on one specific behavior that they want and then we consider the virtues of such behavior (if any). 
[1]: https://www.python.org/dev/peps/pep-0495/#are-two-values-enough -------------- next part -------------- An HTML attachment was scrubbed... URL: From ischwabacher at wisc.edu Mon Aug 24 23:28:11 2015 From: ischwabacher at wisc.edu (ISAAC J SCHWABACHER) Date: Mon, 24 Aug 2015 21:28:11 +0000 Subject: [Datetime-SIG] pytz vs. PEP 495 Was: PEP-431/495 In-Reply-To: <55DB84C3.3010904@stoneleaf.us> References: <55DB6B01.8050606@stoneleaf.us> <55DB765B.8020702@stoneleaf.us> <55DB84C3.3010904@stoneleaf.us> Message-ID: [ijs] > > > I *really* hope the answer to this one is, "don't do that". [Alexander Belopolsky] > > That's not an option because people already *do* [1] that and they won't stop. > > Neither they will stop using datetime.combine() [2] or datetime.replace() [3] > > or tolerate if those methods start raising exceptions. [Ethan Furman] > If the default is True (or False), then this won't be a problem. It will only be None when explicitly asked for. > > `time` can just store the flag, and when it is combined with a date the flag should be checked and if None and the resulting datetime doesn't exist or is ambiguous an exception can be raised. A time with a non-constant-offset tzinfo is always ambiguous, and can have an arbitrary number of possible offsets. There are several time zones with at least three possible offsets for a given time in the last 10 years. How on earth do you define the meaning of a time with a non-constant tzinfo attached? Or does it only mean something when it's recombined with a date? ijs From ethan at stoneleaf.us Mon Aug 24 23:32:22 2015 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 24 Aug 2015 14:32:22 -0700 Subject: [Datetime-SIG] pytz vs. 
PEP 495 Was: PEP-431/495
In-Reply-To:
References: <55DB6B01.8050606@stoneleaf.us> <55DB765B.8020702@stoneleaf.us> <55DB84C3.3010904@stoneleaf.us>
Message-ID: <55DB8D66.9020107@stoneleaf.us>

On 08/24/2015 02:10 PM, Alexander Belopolsky wrote:

> I addressed three reasons why people want to have the third value for the fold
> index in the recent version of the PEP. Let me just note that the three reasons
> are mutually exclusive: for example, the first and last call for different defaults.
> I suggest that the proponents of the fold=None/fold=-1 option first agree on one
> specific behavior that they want and then we consider the virtues of such behavior
> (if any).

From the PEP:

> Moreover, raising an error in the problem cases is only one of many possible
> solutions. An interactive program can ask the user for additional input, while
> a server process may log a warning and take an appropriate default action. We
> cannot possibly provide functions for all possible user requirements, but this
> PEP provides the means to implement any desired behavior in a few lines of code.

It is my contention that library code (such as datetime) should raise exceptions
when something exceptional happens, and the program code can then handle it
appropriately for that program:

    # interactive program
    try:
        get_a_datetime_from_user(ltdf=None)
    except (AmbiguousTimeError, NonExistentTimeError):
        get_clarification_from_user()

    # server program
    try:
        get_a_datetime_from_somewhere(ltdf=None)
    except (AmbiguousTimeError, NonExistentTimeError):
        log(weird_time_error)
        use_default()

--
~Ethan~

From ethan at stoneleaf.us Mon Aug 24 23:35:07 2015
From: ethan at stoneleaf.us (Ethan Furman)
Date: Mon, 24 Aug 2015 14:35:07 -0700
Subject: [Datetime-SIG] pytz vs.
PEP 495 Was: PEP-431/495 In-Reply-To: References: <55DB6B01.8050606@stoneleaf.us> <55DB765B.8020702@stoneleaf.us> <55DB84C3.3010904@stoneleaf.us> Message-ID: <55DB8E0B.2050500@stoneleaf.us> On 08/24/2015 02:28 PM, ISAAC J SCHWABACHER wrote: > [ijs] >>>> I *really* hope the answer to this one is, "don't do that". > > [Alexander Belopolsky] >>> That's not an option because people already *do* [1] that and they won't stop. >>> Neither they will stop using datetime.combine() [2] or datetime.replace() [3] >>> or tolerate if those methods start raising exceptions. > > [Ethan Furman] >> If the default is True (or False), then this won't be a problem. It will only >> be None when explicitly asked for. >> >> `time` can just store the flag, and when it is combined with a date the flag >> should be checked and if None and the resulting datetime doesn't exist or is >> ambiguous an exception can be raised. > > A time with a non-constant-offset tzinfo is always ambiguous, and can have an > arbitrary number of possible offsets. There are several time zones with at least > three possible offsets for a given time in the last 10 years. How on earth do > you define the meaning of a time with a non-constant tzinfo attached? Or does it > only mean something when it's recombined with a date? I hope the only way I would use a plain time is for today (whichever day 'today' happens to be), in which case having a tzinfo is still helpful for knowing what time it is somewhere else. Which is still a buggy proposition on days involving time switches. -- ~Ethan~ From alexander.belopolsky at gmail.com Mon Aug 24 23:45:50 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 24 Aug 2015 17:45:50 -0400 Subject: [Datetime-SIG] pytz vs. 
PEP 495 Was: PEP-431/495
In-Reply-To: <55DB8D66.9020107@stoneleaf.us>
References: <55DB6B01.8050606@stoneleaf.us> <55DB765B.8020702@stoneleaf.us> <55DB84C3.3010904@stoneleaf.us> <55DB8D66.9020107@stoneleaf.us>
Message-ID:

On Mon, Aug 24, 2015 at 5:32 PM, Ethan Furman wrote:

> It is my contention that library code (such as datetime) should raise
> exceptions when something exceptional happens, and the program code can
> then handle it appropriately for that program:
> ...
>
>     # server program
>     try:
>         get_a_datetime_from_somewhere(ltdf=None)
>     except (AmbiguousTimeError, NonExistentTimeError):
>         log(weird_time_error)
>         use_default()
>

I've been criticized on this list for using python-resembling pseudo-code,
so I will assume that what you show is actual Python code. In this case,
the get_a_datetime_from_somewhere and use_default lines have no visible
effect on the program and I have no guess for what they are supposed to do
internally.

I do show in the PEP [1] how a function that raises an error on
ambiguous/missing time can be written.

[1]: https://www.python.org/dev/peps/pep-0495/#strict-invalid-time-checking

From ischwabacher at wisc.edu Mon Aug 24 23:46:56 2015
From: ischwabacher at wisc.edu (ISAAC J SCHWABACHER)
Date: Mon, 24 Aug 2015 21:46:56 +0000
Subject: [Datetime-SIG] pytz vs. PEP 495 Was: PEP-431/495
In-Reply-To: <55DB8E0B.2050500@stoneleaf.us>
References: <55DB6B01.8050606@stoneleaf.us> <55DB765B.8020702@stoneleaf.us> <55DB84C3.3010904@stoneleaf.us> <55DB8E0B.2050500@stoneleaf.us>
Message-ID:

> > [ijs]
> >>>> I *really* hope the answer to this one is, "don't do that".
> >
> > [Alexander Belopolsky]
> >>> That's not an option because people already *do* [1] that and they won't stop.
> >>> Neither they will stop using datetime.combine() [2] or datetime.replace() [3]
> >>> or tolerate if those methods start raising exceptions.
> > > > [Ethan Furman] > >> If the default is True (or False), then this won't be a problem. It will only > >> be None when explicitly asked for. > >> > >> `time` can just store the flag, and when it is combined with a date the flag > >> should be checked and if None and the resulting datetime doesn't exist or is > >> ambiguous an exception can be raised. > > > > A time with a non-constant-offset tzinfo is always ambiguous, and can have an > > arbitrary number of possible offsets. There are several time zones with at least > > three possible offsets for a given time in the last 10 years. How on earth do > > you define the meaning of a time with a non-constant tzinfo attached? Or does it > > only mean something when it's recombined with a date? > > I hope the only way I would use a plain time is for today (whichever day 'today' happens to be), in which case having a tzinfo is still helpful for knowing what time it is somewhere else. Which is > still a buggy proposition on days involving time switches. Sounds like "don't do that" to me. ijs From ethan at stoneleaf.us Mon Aug 24 23:58:56 2015 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 24 Aug 2015 14:58:56 -0700 Subject: [Datetime-SIG] pytz vs. PEP 495 Was: PEP-431/495 In-Reply-To: References: <55DB6B01.8050606@stoneleaf.us> <55DB765B.8020702@stoneleaf.us> <55DB84C3.3010904@stoneleaf.us> <55DB8D66.9020107@stoneleaf.us> Message-ID: <55DB93A0.1040602@stoneleaf.us> On 08/24/2015 02:45 PM, Alexander Belopolsky wrote: > On Mon, Aug 24, 2015 at 5:32 PM, Ethan Furman wrote: > >> It is my contention that library code (such as datetime) should raise exceptions >> when something exceptional happens, and the program code can then handle it >> appropriately for that program: > > ... 
> > > # server program > try: > get_a_datetime_from_somewhere(ltdf=None) > except AmbiguousTimeError, NonExistentTimeError: > log(weird_time_error) > use_default() > > > I've been criticized on this list for using python-resembling pseudo-code, so I will > assume that what you show is an actual Python code. In this case, > `get_a_datetime_from_somewhere` and `use_default` lines have no visible effect on the > program and I have no guess for what they are supposed to do internally. You're right, I was sloppy. How's this: try: some_date = get_a_datetime_from_somewhere(ltdf=None) except (AmbiguousTimeError, NonExistentTimeError) as e: log(weird_time_error, e) some_date = use_default(e.datetime) and the datetime attribute of e would be the attempted datetime with the flag set to None (so it is either still ambiguous, or still doesn't exist), and then use_default() would do something with it. > I do show in the PEP how a function that raises an error on ambiguous/missing > time can be written. Writing functions to work around shortcomings is not how I like to spend my time (see my contention above about library code). If we do go the route of not raising exceptions we should just add your function to the datetime module and save people from mis-copying it. -- ~Ethan~ From alexander.belopolsky at gmail.com Tue Aug 25 00:18:23 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 24 Aug 2015 18:18:23 -0400 Subject: [Datetime-SIG] pytz vs. PEP 495 Was: PEP-431/495 In-Reply-To: <55DB93A0.1040602@stoneleaf.us> References: <55DB6B01.8050606@stoneleaf.us> <55DB765B.8020702@stoneleaf.us> <55DB84C3.3010904@stoneleaf.us> <55DB8D66.9020107@stoneleaf.us> <55DB93A0.1040602@stoneleaf.us> Message-ID: On Mon, Aug 24, 2015 at 5:58 PM, Ethan Furman wrote: > try: > some_date = get_a_datetime_from_somewhere(ltdf=None) > except (AmbiguousTimeError, NonExistentTimeError) as e: > log(weird_time_error, e) > some_date = use_default(e.datetime) > > ..
> Writing functions to work around short-comings is not how I like to spend > my time (see my contention above about library code). > Neither do I. That's why in my code you are more likely to see some_date = get_a_datetime_from_somewhere() instead of the wall of text that you presented. As you can imagine, I will not be happy facing a prospect of adding try:.. except's everywhere to port my code to Python 3.6. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Tue Aug 25 00:22:12 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 24 Aug 2015 18:22:12 -0400 Subject: [Datetime-SIG] Post-PEP 495 ideas Was: PEP-431/495 Message-ID: On Mon, Aug 24, 2015 at 5:58 PM, Ethan Furman wrote: > If we do go the route of not raising exceptions we should just add your > function to the datetime module and save people from mis-copying it. We will have plenty of time discussing this and other ideas after PEP 495 is accepted and implemented. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Tue Aug 25 00:40:36 2015 From: ethan at stoneleaf.us (Ethan Furman) Date: Mon, 24 Aug 2015 15:40:36 -0700 Subject: [Datetime-SIG] pytz vs. PEP 495 Was: PEP-431/495 In-Reply-To: References: <55DB6B01.8050606@stoneleaf.us> <55DB765B.8020702@stoneleaf.us> <55DB84C3.3010904@stoneleaf.us> <55DB8D66.9020107@stoneleaf.us> <55DB93A0.1040602@stoneleaf.us> Message-ID: <55DB9D64.4030806@stoneleaf.us> On 08/24/2015 03:18 PM, Alexander Belopolsky wrote: > On Mon, Aug 24, 2015 at 5:58 PM, Ethan Furman wrote: > > try: > some_date = get_a_datetime_from_somewhere(ltdf=None) > except AmbiguousTimeError, NonExistentTimeError as e: > log(weird_time_error, e) > some_date = use_default(e.datetime) > > .. >> Writing functions to work around short-comings is not how I like to spend >> my time (see my contention above about library code). > > > Neither do I. 
That's why in my code you are more likely to see > > some_date = get_a_datetime_from_somewhere() > > instead of the wall of text that you presented. As you can imagine, I will > not be happy facing a prospect of adding try:.. except's everywhere to port > my code to Python 3.6. Why would you have to? I am not suggesting that None be the default, simply that it be available. -- ~Ethan~ From ischwabacher at wisc.edu Mon Aug 24 23:58:27 2015 From: ischwabacher at wisc.edu (ISAAC J SCHWABACHER) Date: Mon, 24 Aug 2015 21:58:27 +0000 Subject: [Datetime-SIG] Conversion vs arithmetic (was Re: Is EDT a timezone? Was: PEP-0500) In-Reply-To: References: Message-ID: [ijs] > > I guess I'll have to try to implement it and find out > > where the headaches are myself. [Tim Peters] > Have fun ;-) Always! :) > > It's also worth noting that backward compatibility cuts both > > ways: because Stuart Bishop has chosen to implement > > timeline arithmetic, any stdlib inclusion of pytz will have to > > support timeline arithmetic in some form in order to > > maintain backward compatibility with the package itself. > > We _could_ include pytz exactly the way it is today, without changing > anything (else) in Python or pytz. There are lots of ways to deal > with this. We could also say, more like dateutil did, that Python > finds the Olson database useful but wants _nothing_ to do with > timeline arithmetic, not ever. Etc. > > In any case, there is no active PEP at the moment proposing to add any > new timezone implementations to Python. One step at a time. PEP 431 did. But I agree, PEP 495 is a small and straightforward step in the right direction. Though it's not looking like the bikeshed is going to get painted my color. :) > > I see how PEP 495 makes it possible to convert datetimes correctly > > in all cases, > > Whew! You may well be only the second person to grasp this - thanks :-) Wait... was this ever in contention? 
> > but I don't see how it makes it possible to implement > > time zones that will be pytz-compatible without continuing to require > > the localize/normalize dance. > > Indeed it doesn't - nor does it preclude it. PEP 495 says nothing > whatsoever about timeline arithmetic, or about any other feature of > any external timezone library except to the extent it adds new > requirements on their Python-specified tzinfo methods (chiefly > .utcoffset() and .fromutc()). > > If pytz plays along with 495, then there's a new baseline to start > from. For example, the current (I think) pytz docs say: > > """ > Converting between timezones also needs special attention. We also need > to use the ``normalize()`` method to ensure the conversion is correct. > """ > > That dance "should" no longer be necessary. But I don't know - > perhaps pytz will continue to need it anyway. That's up to Stuart, > not to me. I assume pytz is committed for backward compatibility to timeline arithmetic for datetime subtraction, which means it can't afford to change its "one tz instance per offset" design as long as datetime arithmetic ignores the time zone when both operands share the same tzinfo. There's no hook for pytz to hang its behavior on, and PEP 495 doesn't give it one. On the upside, it does look like there's a way to foil the dreaded self check in datetime_astimezone and make dt.astimezone(tz) work as intended. ijs From carl at oddbird.net Tue Aug 25 01:01:55 2015 From: carl at oddbird.net (Carl Meyer) Date: Mon, 24 Aug 2015 17:01:55 -0600 Subject: [Datetime-SIG] Conversion vs arithmetic (was Re: Is EDT a timezone? Was: PEP-0500) In-Reply-To: References: Message-ID: <55DBA263.1010407@oddbird.net> On 08/24/2015 03:58 PM, ISAAC J SCHWABACHER wrote: > [ijs] >>> I see how PEP 495 makes it possible to convert datetimes correctly >>> in all cases, >> > [Tim Peters] >> Whew! You may well be only the second person to grasp this - thanks :-) > > Wait... was this ever in contention? I don't think so.
I'm not sure what Tim is referring to here; I haven't seen any messages in this thread that indicated a lack of understanding on this point. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From alexander.belopolsky at gmail.com Tue Aug 25 01:23:20 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 24 Aug 2015 19:23:20 -0400 Subject: [Datetime-SIG] Conversion vs arithmetic (was Re: Is EDT a timezone? Was: PEP-0500) In-Reply-To: <55DBA263.1010407@oddbird.net> References: <55DBA263.1010407@oddbird.net> Message-ID: On Mon, Aug 24, 2015 at 7:01 PM, Carl Meyer wrote: > On 08/24/2015 03:58 PM, ISAAC J SCHWABACHER wrote: > > [ijs] > >>> I see how PEP 495 makes it possible to convert datetimes correctly > >>> in all cases, > >> > > [Tim Peters] > >> Whew! You may well be only the second person to grasp this - thanks :-) > > > > Wait... was this ever in contention? > > I don't think so. I'm not sure what Tim is referring to here; I haven't > seen any messages in this thread that indicated a lack of understanding > on this point. > I cannot speak for Tim, but here is an exercise you can try to check your grasp of PEP 495: given PEP-compliant utcoffset() and dst() methods for the current rules in the US/Eastern timezone, write a PEP-compliant fromutc(). You can start from the current datetime.py implementation. [1] [1]: https://hg.python.org/cpython/file/v3.5.0rc1/Lib/datetime.py#l957 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ischwabacher at wisc.edu Mon Aug 24 22:28:38 2015 From: ischwabacher at wisc.edu (ISAAC J SCHWABACHER) Date: Mon, 24 Aug 2015 20:28:38 +0000 Subject: [Datetime-SIG] pytz vs. 
PEP 495 Was: PEP-431/495 In-Reply-To: References: <55DB6B01.8050606@stoneleaf.us> <55DB765B.8020702@stoneleaf.us> Message-ID: [Alexander Belopolsky] > >>> dt = time(2, first=None, tzinfo=Eastern2) [ijs] I *really* hope the answer to this one is, "don't do that". ijs From tim.peters at gmail.com Tue Aug 25 01:50:18 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 24 Aug 2015 18:50:18 -0500 Subject: [Datetime-SIG] Conversion vs arithmetic (was Re: Is EDT a timezone? Was: PEP-0500) In-Reply-To: <55DBA263.1010407@oddbird.net> References: <55DBA263.1010407@oddbird.net> Message-ID: [ijs] >>>> I see how PEP 495 makes it possible to convert datetimes correctly >>>> in all cases, [Tim] >>> Whew! You may well be only the second person to grasp this - thanks :-) [ijs] >> Wait... was this ever in contention? [Carl Meyer] > I don't think so. I'm not sure what Tim is referring to here; I haven't > seen any messages in this thread that indicated a lack of understanding > on this point. I've lost track of how often I've explained it. That suggests it's not being understood. Start with the message at the top of _this_ (renamed) thread, which was posted today, so it shouldn't fade entirely from memory for at least another 37 minutes ;-) > ... > Conversion was the larger issue, and to do that correctly the > arithmetic needed to be fixed. Isaac (quoted at the top of _this_ message) is the first to say "ya, I get it" in public. Someone else said it once in private, like a month ago ;-) From carl at oddbird.net Tue Aug 25 01:55:23 2015 From: carl at oddbird.net (Carl Meyer) Date: Mon, 24 Aug 2015 17:55:23 -0600 Subject: [Datetime-SIG] Conversion vs arithmetic (was Re: Is EDT a timezone?
Was: PEP-0500) In-Reply-To: References: <55DBA263.1010407@oddbird.net> Message-ID: <55DBAEEB.4030701@oddbird.net> On 08/24/2015 05:23 PM, Alexander Belopolsky wrote: > > On Mon, Aug 24, 2015 at 7:01 PM, Carl Meyer > wrote: > > On 08/24/2015 03:58 PM, ISAAC J SCHWABACHER wrote: > > [ijs] > >>> I see how PEP 495 makes it possible to convert datetimes correctly > >>> in all cases, > >> > > [Tim Peters] > >> Whew! You may well be only the second person to grasp this - thanks :-) > > > > Wait... was this ever in contention? > > I don't think so. I'm not sure what Tim is referring to here; I haven't > seen any messages in this thread that indicated a lack of understanding > on this point. > > > I cannot speak for Tim, but here is an exercise you can try to check > your grasp of PEP 495: given PEP-compliant utcoffset() and dst() methods > for the current rules in the US/Eastern timezone, write a PEP-compliant > fromutc(). You can start from the current datetime.py implementation. [1] > > [1]: https://hg.python.org/cpython/file/v3.5.0rc1/Lib/datetime.py#l957 Looks fun; I'll give it a try later tonight. But you're changing the subject. That exercise requires far more detailed knowledge of the tzinfo APIs than is required to understand that a single added disambiguating bit is enough to handle timezone conversions, even in the presence of (two-layered) folds and gaps. I haven't seen anyone object to PEP 495 on the grounds that it doesn't provide enough information to handle timezone conversions (well, except in the as-yet-hypothetical case of a three-layered fold, which could be handled by your latest proposal to make `ltdf` an integer rather than a boolean). I've seen a lot of discussion of the name of the flag, and a couple people express a desire to have a simpler built-in spelling of "raise an exception on invalid/ambiguous time" than "convert twice with two different flag values and compare the results". 
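The "convert twice with two different flag values and compare the results" check can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses `fold`, the name the flag ended up with in Python 3.6, rather than `ltdf` or `first`; `ToyEastern`, `strict`, and the two exception classes (named after Ethan's example) are hypothetical and are not part of the datetime module.

```python
from datetime import datetime, timedelta, tzinfo

HOUR = timedelta(hours=1)
ZERO = timedelta(0)


class AmbiguousTimeError(ValueError):
    """Hypothetical: the wall time occurs twice (fold)."""


class NonExistentTimeError(ValueError):
    """Hypothetical: the wall time is skipped (gap)."""


class ToyEastern(tzinfo):
    """Hypothetical fold-aware zone; knows only the 2004 US/Eastern transitions."""

    GAP = datetime(2004, 4, 4, 2)      # wall clock skips 02:00-03:00
    FOLD = datetime(2004, 10, 31, 1)   # wall clock shows 01:00-02:00 twice

    def dst(self, dt):
        wall = dt.replace(tzinfo=None)
        if self.GAP <= wall < self.GAP + HOUR:
            return HOUR if dt.fold else ZERO    # fold=0: rule before the jump
        if self.FOLD <= wall < self.FOLD + HOUR:
            return ZERO if dt.fold else HOUR    # fold=0: first (daylight) pass
        return HOUR if self.GAP <= wall < self.FOLD else ZERO

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)


def strict(dt):
    """Return dt unchanged, or raise if its wall time is repeated or skipped."""
    early = dt.replace(fold=0).utcoffset()
    late = dt.replace(fold=1).utcoffset()
    if early == late:
        return dt                       # ordinary time: the flag is irrelevant
    if early > late:
        raise AmbiguousTimeError(dt)    # offset drops across a fold
    raise NonExistentTimeError(dt)      # offset rises across a gap


tz = ToyEastern()
for wall in (datetime(2004, 7, 1, 12), datetime(2004, 10, 31, 1, 30),
             datetime(2004, 4, 4, 2, 30)):
    try:
        strict(wall.replace(tzinfo=tz))
        print(wall, "ok")
    except ValueError as exc:
        print(wall, type(exc).__name__)
```

The point of the sketch is that the opt-in strictness needs nothing beyond the one-bit flag: callers who want exceptions wrap their construction in `strict()`, and everyone else keeps the non-raising default.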
Carl From alexander.belopolsky at gmail.com Tue Aug 25 02:00:58 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 24 Aug 2015 20:00:58 -0400 Subject: [Datetime-SIG] Conversion vs arithmetic (was Re: Is EDT a timezone? Was: PEP-0500) In-Reply-To: <55DBAEEB.4030701@oddbird.net> References: <55DBA263.1010407@oddbird.net> <55DBAEEB.4030701@oddbird.net> Message-ID: On Mon, Aug 24, 2015 at 7:55 PM, Carl Meyer wrote: > But you're changing the subject. That exercise requires far more > detailed knowledge of the tzinfo APIs than is required to understand > that a single added disambiguating bit is enough to handle timezone > conversions, even in the presence of (two-layered) folds and gaps. > Sorry for that. Meanwhile, however, Tim has responded, and the point that both of us think is misunderstood is that, contrary to Stuart's assertion below and at the top of this thread, all conversion issues can be addressed without any change to arithmetic. (Moreover, addressing them with classic arithmetic is easier.) [Stuart Bishop ] > ... > Conversion was the larger issue, and to do that correctly the > arithmetic needed to be fixed. From guido at python.org Tue Aug 25 02:07:31 2015 From: guido at python.org (Guido van Rossum) Date: Mon, 24 Aug 2015 17:07:31 -0700 Subject: [Datetime-SIG] Conversion vs arithmetic (was Re: Is EDT a timezone? Was: PEP-0500) In-Reply-To: References: <55DBA263.1010407@oddbird.net> <55DBAEEB.4030701@oddbird.net> Message-ID: Sounds to me like Tim and Stuart disagree on the definition of conversion, though.
On Mon, Aug 24, 2015 at 5:00 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > On Mon, Aug 24, 2015 at 7:55 PM, Carl Meyer wrote: > >> But you're changing the subject. That exercise requires far more >> detailed knowledge of the tzinfo APIs than is required to understand >> that a single added disambiguating bit is enough to handle timezone >> conversions, even in the presence of (two-layered) folds and gaps. >> > > Sorry for that. Meanwhile, however Tim has responded and the point that > both of us think is misunderstood is that contrary to Stuart's assertion > below and at the top of this thread, all conversion issues can be addressed > without any change to arithmetic. (Moreover, addressing them with classic > arithmetic is easier.) > > [Stuart Bishop ] > > ... > > Conversion was the larger issue, and to do that correctly the > > arithmetic needed to be fixed. > > _______________________________________________ > Datetime-SIG mailing list > Datetime-SIG at python.org > https://mail.python.org/mailman/listinfo/datetime-sig > The PSF Code of Conduct applies to this mailing list: > https://www.python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Tue Aug 25 02:15:18 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 24 Aug 2015 20:15:18 -0400 Subject: [Datetime-SIG] Conversion vs arithmetic (was Re: Is EDT a timezone? Was: PEP-0500) In-Reply-To: References: <55DBA263.1010407@oddbird.net> <55DBAEEB.4030701@oddbird.net> Message-ID: On Mon, Aug 24, 2015 at 8:07 PM, Guido van Rossum wrote: > Sounds to me like Tim and Stuart disagree on the definition of conversion, > though. 
That's not surprising given that Tim's .fromutc() [1] *does not* change .tzinfo (so arguably it is not a timezone conversion) but *does* involve arithmetic (because ignoring the DST nonsense, it is just dt + dt.utcoffset()). Fortunately, Tim has established that there are only two people in the world that need to understand any of this, but Stuart is one of them. :-) [1]: https://hg.python.org/cpython/file/v3.5.0rc1/Lib/datetime.py#l957 -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at oddbird.net Tue Aug 25 02:15:34 2015 From: carl at oddbird.net (Carl Meyer) Date: Mon, 24 Aug 2015 18:15:34 -0600 Subject: [Datetime-SIG] Conversion vs arithmetic (was Re: Is EDT a timezone? Was: PEP-0500) In-Reply-To: References: <55DBA263.1010407@oddbird.net> Message-ID: <55DBB3A6.802@oddbird.net> On 08/24/2015 05:50 PM, Tim Peters wrote: > [ijs] >>>>> I see how PEP 495 makes it possible to convert datetimes correctly >>>>> in all cases, > > [Tim] >>>> Whew! You may well be only the second person to grasp this - thanks :-) > > [ijs] >>> Wait... was this ever in contention? > > [Carl Meyer] >> I don't think so. I'm not sure what Tim is referring to here; I haven't >> seen any messages in this thread that indicated a lack of understanding >> on this point. > > I've lost track of how often I've explained it. That suggests it's > not being understood. Start with the message at the top of _this_ > (renamed) thread, which was posted today, so it shouldn't fade > entirely from memory for at least another 37 minutes ;-) Fair enough :-) I can't seem to find Stuart's message that you were replying to, but my (sometimes failing) memory thought it was more of a design justification of pytz (in a non-PEP-495 world) than a critique of PEP 495. 
In any case, since I've argued several times that naive arithmetic on aware datetimes is wrong, I'll be crystal clear: I don't think there's any question that it is possible to make conversions unambiguous without changing how datetime arithmetic works, and that PEP 495 achieves that much. My only quibbles about PEP 495 are minor ones: naming, API convenience for "strict mode", and the fact that it won't help me, because until we do change the arithmetic, I'll keep on using pytz and its fixed-offset tzinfo objects :-) Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From tim.peters at gmail.com Tue Aug 25 02:23:09 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 24 Aug 2015 19:23:09 -0500 Subject: [Datetime-SIG] Conversion vs arithmetic (was Re: Is EDT a timezone? Was: PEP-0500) In-Reply-To: References: <55DBA263.1010407@oddbird.net> <55DBAEEB.4030701@oddbird.net> Message-ID: [Guido] > Sounds to me like Tim and Stuart disagree on the definition of conversion, > though. I don't think so. The pytz docs contain a variant of the word "conversion" 6 times, each one followed by examples without any arithmetic but with calls to .astimezone(). Contrarily, the examples showing arithmetic are not preceded by text calling them any variant of "conversions". So, in all, it's more likely I had no real idea what "Conversion was the larger issue, and to do that correctly the arithmetic needed to be fixed" meant to begin with ;-) From alexander.belopolsky at gmail.com Tue Aug 25 02:30:36 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 24 Aug 2015 20:30:36 -0400 Subject: [Datetime-SIG] Conversion vs arithmetic (was Re: Is EDT a timezone? 
Was: PEP-0500) In-Reply-To: <55DBB3A6.802@oddbird.net> References: <55DBA263.1010407@oddbird.net> <55DBB3A6.802@oddbird.net> Message-ID: On Mon, Aug 24, 2015 at 8:15 PM, Carl Meyer wrote: > > My only quibbles about PEP 495 are minor ones: naming, ... I could not dig out your position on naming, so if you still have those "quibbles", please respond on the "PEP 495 Q & A" thread. -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at oddbird.net Tue Aug 25 02:34:50 2015 From: carl at oddbird.net (Carl Meyer) Date: Mon, 24 Aug 2015 18:34:50 -0600 Subject: [Datetime-SIG] PEP 495 Q & A In-Reply-To: References: Message-ID: <55DBB82A.9040901@oddbird.net> On 08/22/2015 08:20 PM, Tim Peters wrote: > [Alexander Belopolsky] >> ... >> Note that I did not include all suggestions for the name of the flag, but I >> thank everyone who made their suggestions. I think we are really left with >> two contenders: "fold" and "later." The only additional variant I would >> like to consider is "fold" with the integer values of 0 and 1. I think >> time(1, 30, fold=1) is short and sweet and looks better than time(1, 30, >> later=True). >> >> Note that neither spelling is self-explanatory, particularly if you see >> something like if dt.replace(later=True) < dt.replace(later=False) in >> someone's code, but the word "fold" points you in the right direction and is >> more Google-friendly than "later". >> >> The reason I think fold=0 and fold=1 may work better than booleans, is that >> you can think of the local time line as consisting of two "folds" one - the >> main timeline and the other a discontinuous line covering the fall-back >> hours. > > I'm on board with fold=0 and fold=1. I only hated "fold" when it was > False and True. Now we're indexing a theoretically unbounded sequence > of folds by an ordinal, which makes perfect sense - the later the > time, the larger the ordinal ;-) That's ok by me. 
It seems wrong to use `fold` to decide which side of a gap to choose, too -- but it's even more wrong to use `first` or `later` when they actually mean the reverse when disambiguating a gap. So I'll go with your explanation that a gap is a "negative fold" and be happy :-) Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From tim.peters at gmail.com Tue Aug 25 04:18:42 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 24 Aug 2015 21:18:42 -0500 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: [Stuart Bishop] I'm skipping to the end here to maintain some hope of temporal continuity with the "Conversion vs arithmetic" thread I already spun off: > For amusement, here is how you can add an hour and end up exactly > where you started. Careful you do your conversions at the right time, > or the dst transition might eat your data (this example performed by a > professional stuntman and should not be attempted at home): > > >>> from pytz.reference import Eastern > >>> dt = datetime(2004, 4, 4, 1, 0, 0, tzinfo=Eastern) > >>> str(dt.astimezone(timezone.utc)) > '2004-04-04 06:00:00+00:00' > >>> str((dt + timedelta(hours=1)).astimezone(timezone.utc)) > '2004-04-04 06:00:00+00:00' The missing bit there: from datetime import datetime, timezone, timedelta at the start. For those who don't know (like me ;-) ), pytz.reference.Eastern appears to be a copy/paste from the first version of the datetime docs, showing a very simple implementation of US daylight rules at the time it was written (transitions at 2am local time on the first Sunday of April and the last Sunday of October). Eastern's standard offset is -5 (and daylight -4). So you're creating a gap time at DST start, via classic arithmetic, and marveling ;-) at the senseless output. 
The original dt is "in standard time", and correctly maps to the UTC hour 5 hours later. Adding an hour creates 2am standard time, which doesn't exist on the local clock (which jumps from 1:59:59 to 3:00:00). But since 2 is >= 2, it's considered to be in daylight time, and offset -4 maps to the same 6am UTC. In timeline arithmetic, the addition would have jumped to 3am, and offset -4 would map to 7am UTC. It would be a mistake to believe it wasn't all known a dozen years ago that stuff like that would happen. We were acutely aware of it. Regardless, classic arithmetic was a deliberate design decision. I don't think it would help to keep repeating why, not any more than it would help to keep showing examples where timeline and classic arithmetic give different results. Everyone already knows the latter, and everyone already knows some people hate it ;-) Why I wanted to isolate this part was for the _conversion_ question, which is a source of surprises divorced from arithmetic. Like so: u = datetime(2004, 10, 31, 5, 27, tzinfo=timezone.utc) print(u) print(u.astimezone(Eastern).astimezone(timezone.utc)) There I contrived to create a UTC time in the fold at Eastern DST end (in the same year as your example). Then roundtrip to Eastern and back The output: 2004-10-31 05:27:00+00:00 2004-10-31 06:27:00+00:00 So that's how you can do nothing at all yet end up an hour away ;-) In that case, `u` maps to 1:27am Eastern daylight time, which is ambiguous in Eastern. Because it's impossible now to record that the earlier of the ambiguous hours was picked, on the trip back it's considered to be in standard time, so maps to the next UTC hour. _This_ part has nothing to do with timeline arithmetic, and (as explained in the "Conversion vs arithmetic" spinoff) will fix itself by magic as soon as `Eastern` and `timezone.utc` tzinfos implementing PEP 495 exist. Conversion surprises have nothing to do with arithmetic; they're solely due to the lack of a disambiguation bit. 
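The round trip above can be replayed against a fold-aware tzinfo to see the disambiguation bit at work. The sketch below is a hypothetical stand-in, not the `Eastern` class from pytz.reference or the datetime docs: `Eastern2004` hard-codes only the 2004 transitions, `timeline_add` is an illustrative helper, and the flag is assumed under its final Python 3.6 name, `fold`.

```python
from datetime import datetime, timedelta, timezone, tzinfo

HOUR = timedelta(hours=1)
ZERO = timedelta(0)
STD = timedelta(hours=-5)                  # EST offset

# Hard-coded 2004 US rules (illustration only).
GAP_START = datetime(2004, 4, 4, 2)        # wall clock skips 02:00-03:00
FOLD_START = datetime(2004, 10, 31, 1)     # wall clock shows 01:00-02:00 twice
UTC_DST_START = GAP_START - STD            # 2004-04-04 07:00 UTC
UTC_DST_END = FOLD_START - STD             # 2004-10-31 06:00 UTC


class Eastern2004(tzinfo):
    """Hypothetical fold-aware tzinfo; valid for 2004 only."""

    def dst(self, dt):
        wall = dt.replace(tzinfo=None)
        if GAP_START <= wall < GAP_START + HOUR:      # missing hour
            return HOUR if dt.fold else ZERO
        if FOLD_START <= wall < FOLD_START + HOUR:    # repeated hour
            return ZERO if dt.fold else HOUR
        return HOUR if GAP_START <= wall < FOLD_START else ZERO

    def utcoffset(self, dt):
        return STD + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"

    def fromutc(self, dt):
        # dt carries UTC field values with tzinfo=self.
        utc = dt.replace(tzinfo=None)
        if UTC_DST_START <= utc < UTC_DST_END:
            return dt + STD + HOUR                    # daylight time, fold=0
        local = dt + STD
        # The first standard-time hour repeats wall times already shown.
        if UTC_DST_END <= utc < UTC_DST_END + HOUR:
            return local.replace(fold=1)
        return local


def timeline_add(dt, delta):
    """Add elapsed time by doing the arithmetic in UTC (illustrative helper)."""
    return (dt.astimezone(timezone.utc) + delta).astimezone(dt.tzinfo)


EASTERN = Eastern2004()

u = datetime(2004, 10, 31, 5, 27, tzinfo=timezone.utc)
first = u.astimezone(EASTERN)              # 01:27 EDT, fold=0
second = (u + HOUR).astimezone(EASTERN)    # 01:27 EST, fold=1
print(first, first.fold)
print(second, second.fold)
print(first.astimezone(timezone.utc) == u)            # round trip is lossless
print(second.astimezone(timezone.utc) == u + HOUR)

start = datetime(2004, 4, 4, 1, 0, tzinfo=EASTERN)
print(start + HOUR)                # classic: wall clock reads 02:00 (in the gap)
print(timeline_add(start, HOUR))   # timeline: wall clock reads 03:00
```

Both ambiguous readings of 01:27 share the same wall-clock fields and differ only in `fold`, which is what lets each one map back to its own UTC hour. Arithmetic is untouched: `start + HOUR` is still classic, and the timeline variant has to be requested explicitly.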
From stuart at stuartbishop.net Tue Aug 25 05:02:16 2015 From: stuart at stuartbishop.net (Stuart Bishop) Date: Tue, 25 Aug 2015 10:02:16 +0700 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: <55DB57E7.20302@oddbird.net> Message-ID: On 25 August 2015 at 01:02, Guido van Rossum wrote: > I'm still confused about what makes Stuart believe a tzinfo database must > also change the arithmetic rules (especially since Gustavo's dateutils > apparently gets along quite well without this). A tzinfo database does not care about the arithmetic rules. The database provided by pytz is designed in such a way that you get correct timeline arithmetic out of Python's datetime library, by exposing each period in the timezone definition as distinct fixed offset tzinfo instances. >>> from datetime import datetime, timezone, timedelta >>> from pytz.reference import Eastern as reftz # Tim's reference implementation >>> import pytz >>> dt = datetime(2004, 4, 4, 6, 59, tzinfo=timezone.utc) >>> str(dt.astimezone(pytz.timezone('US/Eastern'))) '2004-04-04 01:59:00-05:00' >>> str(dt.astimezone(reftz)) '2004-04-04 01:59:00-05:00' >>> str(dt.astimezone(pytz.timezone('US/Eastern')) + timedelta(minutes=1)) '2004-04-04 02:00:00-05:00' >>> str(dt.astimezone(reftz) + timedelta(minutes=1)) '2004-04-04 02:00:00-04:00' So I think pytz's current implementation is entwined with the arithmetic style. But the tzfile loader in pytz could be reworked to present tzinfo instances suitable for classic arithmetic if you want it in stdlib, or the parser out of dateutils used. >> "Classic behaviour as you describe it is a bug. It sounds ok when you >> state it as 'add one day to today and you get the same time tomorrow'. >> It does not sound ok when you state it as 'add one second to now and >> you will normally get now + 1 second, but sometimes you will get an >> instant further in the future, and sometimes you will get an instant >> in the past'." 
> > That sounds tautological to me -- "it's a bug because I find it buggy". I call it a bug because adding 1 second to a moment in time and having it incremented by an entire hour is extremely surprising behaviour. This surprising behaviour also makes it impossible to reliably add an absolute value to a timezone-aware datetime. While people can understand that adding a relative time period such as 1 day might get you the same time tomorrow, it makes absolutely no sense that adding an absolute time period like 1 second or 24 hours does not give you an absolute answer. It is unfortunate that in this case the datetime library interprets a timedelta as a relative period when it is documented as being an absolute period. It may be documented behaviour, but the vast majority will consider it a bug and will consider Python's datetime handling broken. At the moment, hardly anybody is aware of the issue. I'd never even really considered it myself - I was just trying to get my library to spit the right answers out. > Maybe the underlying reason is that to me, even a datetime with tzinfo does > not refer to an instant -- it refers to something that's displayed on a > clock. To me, arithmetic is a way of moving the hands of the clock. Under this definition, the current arithmetic fails. If you have Tim's clock or a phone which displays the utcoffset, there are positions you cannot move the clock hands to. In a fold you can only set the hands to either the pre or post DST position, but not both. But you can set the hands to invalid positions if you want to confuse people. If, however, you stop ignoring the tzinfo you certainly can consider arithmetic as a way of moving the hands of the clock. Add one minute, and the minute hand goes from 59 to 0 and the hour hand increments by one. Add one hour and the hour hand decrements and the utcoffset is changed. All valid positions can be reached, and invalid ones helpfully avoided.
-- Stuart Bishop http://www.stuartbishop.net/ From tim.peters at gmail.com Tue Aug 25 05:55:59 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 24 Aug 2015 22:55:59 -0500 Subject: [Datetime-SIG] Conversion vs arithmetic (was Re: Is EDT a timezone? Was: PEP-0500) In-Reply-To: References: Message-ID: [Tim] >> ... >> In any case, there is no active PEP at the moment proposing to add any >> new timezone implementations to Python. One step at a time. [ijs] > PEP 431 did. Hence my "active" PEP. Lennart withdrew 431. > But I agree, PEP 495 is a small and straightforward step in the > right direction. Though it's not looking like the bikeshed is going > to get painted my color. :) What if we renamed "fold" to "ijs"? We aim to please ;-) > ... > I assume pytz is committed for backward compatibility to timeline > arithmetic for datetime subtraction, which means it can't afford to > change its "one tz instance per offset" design as long as datetime > arithmetic ignores the time zone when it's identical. For backward compatibility, and for consistency with classic datetime + timedelta addition, the default datetime subtraction won't change. They both have to use the same kind of arithmetic for the obvious identities to hold. But nothing in PEP 495 precludes offering optional timeline arithmetic later - arithmetic just isn't in 495's scope. > There's no hook for pytz to hang its behavior on, and PEP 495 > doesn't give it one. I don't grasp why people keep bringing up issues PEP 495 never intended to address when talking _about_ 495. PEP 495 is _not_ a proposed substitute for PEP 431, and Alexander never claimed it was. It's a different path entirely, prudently (IMO) proposing a relatively small change to get things moving in a useful direction. Even if all related activity stops with 495, fixing conversions alone is worth some effort. 
Not even appealing to "naive time" can really excuse screwing up conversions _between_ naive times ;-) > On the upside, it does look like there's a way to foil the dreaded > self check in datetime_astimezone and make dt.astimezone(tz) > work as intended. I can pretty much guarantee it always worked as intended ;-) If you mean that 495 will make it work when dt is naive, that's marginal to me. Treating a naive datetime as being a local time is as senseless as treating it as a UTC time - it's arbitrary. I'd rather datetime.timezone grew a name for a tzinfo representing the system timezone, so "local time" became as easy to spell explicitly (when desired) as "utc" has become already. But that's out of scope for 495 too. From 4kir4.1i at gmail.com Tue Aug 25 08:39:27 2015 From: 4kir4.1i at gmail.com (Akira Li) Date: Tue, 25 Aug 2015 09:39:27 +0300 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: (Stuart Bishop's message of "Tue, 25 Aug 2015 10:02:16 +0700") References: <55DB57E7.20302@oddbird.net> Message-ID: <876143an34.fsf@gmail.com> Stuart Bishop writes: > On 25 August 2015 at 01:02, Guido van Rossum wrote: > >> I'm still confused about what makes Stuart believe a tzinfo database must >> also change the arithmetic rules (especially since Gustavo's dateutils >> apparently gets along quite well without this). > > A tzinfo database does not care about the arithmetic rules. > > The database provided by pytz is designed in such a way that you get > correct timeline arithmetic out of Python's datetime library, by > exposing each period in the timezone definition as distinct fixed > offset tzinfo instances. 
> >>>> from datetime import datetime, timezone, timedelta >>>> from pytz.reference import Eastern as reftz # Tim's reference implementation >>>> import pytz >>>> dt = datetime(2004, 4, 4, 6, 59, tzinfo=timezone.utc) >>>> str(dt.astimezone(pytz.timezone('US/Eastern'))) > '2004-04-04 01:59:00-05:00' >>>> str(dt.astimezone(reftz)) > '2004-04-04 01:59:00-05:00' >>>> str(dt.astimezone(pytz.timezone('US/Eastern')) + timedelta(minutes=1)) > '2004-04-04 02:00:00-05:00' >>>> str(dt.astimezone(reftz) + timedelta(minutes=1)) > '2004-04-04 02:00:00-04:00' > > So I think pytz's current implementation is entwined with the > arithmetic style. But the tzfile loader in pytz could be reworked to > present tzinfo instances suitable for classic arithmetic if you want > it in stdlib, or the parser out of dateutils used. As far as I know utc -> local timezone *conversions* do not work during DST transitions in dateutil [1] while pytz manages just fine. Let's not do, whatever dateutil does here. It would be a regression. The result does not depend on the datetime implementation details. Given a specific tz db version the utc -> local conversion is not a matter of interpretation for recent dates. pytz is the only Python library that does timezone *conversions* correctly for almost all timezones from the tz database in a +/- decade time range. If pep-495 does not help pytz with timezone conversions then what is the point of pep-495? [1] https://github.com/dateutil/dateutil/issues/112 From stuart at stuartbishop.net Tue Aug 25 11:28:13 2015 From: stuart at stuartbishop.net (Stuart Bishop) Date: Tue, 25 Aug 2015 16:28:13 +0700 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: <876143an34.fsf@gmail.com> References: <55DB57E7.20302@oddbird.net> <876143an34.fsf@gmail.com> Message-ID: On 25 August 2015 at 13:39, Akira Li <4kir4.1i at gmail.com> wrote: > As far as I know utc -> local timezone *conversions* do not work during > DST transitions in dateutil [1] while pytz manages just fine. 
>
> Let's not do, whatever dateutil does here. It would be a regression.

dateutils does not work at the moment. With the addition of the new flag proposed by PEP-495 it should be a simple update to make dateutils work. Similarly, depending on the semantics of the updated datetime constructors it may be possible to drop pytz' localize method and have it work better.

--
Stuart Bishop
http://www.stuartbishop.net/

From 4kir4.1i at gmail.com Tue Aug 25 13:44:12 2015
From: 4kir4.1i at gmail.com (Akira Li)
Date: Tue, 25 Aug 2015 14:44:12 +0300
Subject: [Datetime-SIG] PEP-431/495
In-Reply-To: (Stuart Bishop's message of "Tue, 25 Aug 2015 16:28:13 +0700")
References: <55DB57E7.20302@oddbird.net> <876143an34.fsf@gmail.com>
Message-ID: <87zj1f8uer.fsf@gmail.com>

Stuart Bishop writes:

> On 25 August 2015 at 13:39, Akira Li <4kir4.1i at gmail.com> wrote:
>
>> As far as I know utc -> local timezone *conversions* do not work during
>> DST transitions in dateutil [1] while pytz manages just fine.
>>
>> Let's not do, whatever dateutil does here. It would be a regression.
>
> dateutils does not work at the moment. With the addition of the new
> flag proposed by PEP-495 it should be a simple update to make
> dateutils work. Similarly, depending on the semantics of the updated
> datetime constructors it may be possible to drop pytz' localize method
> and have it work better.

It is not clear why one needs a disambiguation flag if there is no ambiguity in utc -> local timezone conversions, but it is great that PEP-495 might help fix the dateutil bug [1].

[1] https://github.com/dateutil/dateutil/issues/112

Should it also help fix the datetime.now(tz.tzlocal()) [2] bug -- getting the current time in the local timezone as an aware datetime object?

[2] https://github.com/dateutil/dateutil/issues/57

note: stdlib variant datetime.now(timezone.utc).astimezone() may fail if it uses time.timezone, time.tzname internally [3,4,5] when tm_gmtoff, tm_zone are not available on a given platform.
[3] http://bugs.python.org/issue1647654 [4] http://bugs.python.org/issue22752 [5] http://bugs.python.org/issue22798 pytz variant datetime.now(tzlocal.get_localzone()) works even during DST transitions. If PEP-495 implies that dateutil type of timezone implementation might be adapted in stdlib; it might be worth mentioning to what types of bugs it leads to. From alexander.belopolsky at gmail.com Tue Aug 25 14:09:45 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 25 Aug 2015 08:09:45 -0400 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: <87zj1f8uer.fsf@gmail.com> References: <55DB57E7.20302@oddbird.net> <876143an34.fsf@gmail.com> <87zj1f8uer.fsf@gmail.com> Message-ID: > On Aug 25, 2015, at 7:44 AM, Akira Li <4kir4.1i at gmail.com> wrote: > > note: stdlib variant datetime.now(timezone.utc).astimezone() may fail if it > uses time.timezone, time.tzname internally [3,4,5] when tm_gmtoff > tm_zone are not available on a given platform. If this actually happens on any supported platform - please file a bug report. What we do in this case is not as simplistic as you describe. From stuart at stuartbishop.net Tue Aug 25 15:30:50 2015 From: stuart at stuartbishop.net (Stuart Bishop) Date: Tue, 25 Aug 2015 20:30:50 +0700 Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking Message-ID: There is a point I'd like to clarify in the current PEP: """ Strict Invalid Time Checking Another suggestion was to use first=-1 or first=None to indicate that the program truly has no means to deal with the folds and gaps and dt.utcoffset() should raise an error whenever dt represents an ambiguous or missing local time. The main problem with this proposal, is that dt.utcoffset() is used internally in situations where raising an error is not an option: for example, in dictionary lookups or list/set membership checks. 
So strict gap/fold checking behavior would need to be controlled by a separate flag, say dt.utcoffset(raise_on_gap=True, raise_on_fold=False) . """ As mentioned elsewhere, pytz requires strict checking to remain backwards compatible. The description above does not match the desired behaviour though. pytz users need to optionally have exceptions raised when they try to construct an invalid or ambiguous datetime instance, directly via __new__ or indirectly with something like dt.replace(). If these methods are called with first=None, it will be passed through to dt.utcoffset() and it may raise an exception. dt.utcoffset() will only ever raise an exception when the user has explicitly requested construction of a datetime with strict checking. It will never raise an exception in normal operation, including arithmetic, and could not cause problems in the situations you cite. This would require the first argument to be available in all the methods that can construct datetime instances, including things not mentioned in the PEP like dt.astimezone() (Where I think it might be using the first flag from the original dt instance, rather than allowing the user to specify which side of the fold in the target zone). -- Stuart Bishop http://www.stuartbishop.net/ From alexander.belopolsky at gmail.com Tue Aug 25 16:51:16 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 25 Aug 2015 10:51:16 -0400 Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking In-Reply-To: References: Message-ID: On Tue, Aug 25, 2015 at 9:30 AM, Stuart Bishop wrote: > As mentioned elsewhere, pytz requires strict checking to remain > backwards compatible. > Can you provide the specific examples where strict checking is required? Since pytz already has a disambiguation solution, PEP 495 is not as indispensable for it as it is for datetime or dateutil. 
However, there is one case I can think of where pytz will benefit: with the PEP, it will be possible to make, say, Eastern.localize(datetime.now()) work correctly at all times. If, for backward compatibility, you want to continue raising AmbiguousTimeError during one hour each year, I am sure you will figure out how to make Eastern.localize(datetime.now(), isdst=None) do so. (Hint: Don't change anything in this branch of your code.)

> The description above does not match the desired
> behaviour though.
>

I assume you refer to "Another suggestion was to use first=-1 or first=None to indicate that the program truly has no means to deal with the folds and gaps and dt.utcoffset() should raise an error whenever dt represents an ambiguous or missing local time."

It looks like you want to make it impossible to construct invalid dt instances. In other words, you want to make dt.replace(fold=-1) or dt.replace(tzinfo=Eastern) raise an error under certain circumstances. Is this right?

> pytz users need to optionally have exceptions raised
> when they try to construct an invalid or ambiguous datetime instance,
>

This is a legitimate need, but why does it need to be done in datetime rather than in pytz itself? You already ignore the datetime(..., tzinfo=...) constructor and require your users to call localize() instead. What stops you from providing a function strict_datetime() that will perform any checks that you or your users desire?

> directly via __new__ or indirectly with something like dt.replace().
>

__new__ and .replace() are low level methods called in many performance critical places. We cannot afford to call arbitrary python code in those methods.

> If these methods are called with first=None, it will be passed through
> to dt.utcoffset() and it may raise an exception.
>

This part I don't understand. If __new__ raises an exception - you will have no instance to "be passed through to dt.utcoffset()."
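
A strict_datetime() helper of the kind suggested above might look roughly like the following. This is only a sketch under stated assumptions: it relies on the `fold` attribute as it eventually shipped in Python 3.6, it assumes the tzinfo follows the PEP 495 convention that fold=0 and fold=1 give different offsets only inside a gap or a fold, ToyEastern2004 is a hypothetical 2004-only zone, and the two exception classes are local stand-ins rather than pytz's own.

```python
from datetime import datetime, timedelta, tzinfo

HOUR = timedelta(hours=1)
ZERO = timedelta(0)


class AmbiguousTimeError(ValueError):
    """Local stand-in: the wall-clock time occurs twice (a fold)."""


class NonExistentTimeError(ValueError):
    """Local stand-in: the wall-clock time never occurs (a gap)."""


class ToyEastern2004(tzinfo):
    """Hypothetical US/Eastern rules for 2004 only, honouring dt.fold."""

    START = datetime(2004, 4, 4, 2)   # 02:00 jumps forward to 03:00
    END = datetime(2004, 10, 31, 2)   # 02:00 falls back to 01:00

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def dst(self, dt):
        naive = dt.replace(tzinfo=None)
        if self.START + HOUR <= naive < self.END - HOUR:
            return HOUR                       # plain daylight time
        if self.END - HOUR <= naive < self.END:
            return ZERO if dt.fold else HOUR  # fold: fold=0 is the first pass
        if self.START <= naive < self.START + HOUR:
            return HOUR if dt.fold else ZERO  # gap: fold=0 keeps the old offset
        return ZERO                           # plain standard time

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


def strict_datetime(*args, tz, **kwargs):
    """Build an aware datetime, refusing wall times in a gap or a fold."""
    dt = datetime(*args, tzinfo=tz, **kwargs)
    before = dt.replace(fold=0).utcoffset()
    after = dt.replace(fold=1).utcoffset()
    if before > after:
        raise AmbiguousTimeError(dt.isoformat())
    if before < after:
        raise NonExistentTimeError(dt.isoformat())
    return dt


tz = ToyEastern2004()
print(strict_datetime(2004, 7, 1, 12, tz=tz))
for wall in [(2004, 10, 31, 1, 30), (2004, 4, 4, 2, 30)]:
    try:
        strict_datetime(*wall, tz=tz)
    except ValueError as exc:
        print(type(exc).__name__, exc)
```

Keeping the check in a separate function leaves __new__ and .replace() untouched, so only callers who ask for strictness pay for the two extra utcoffset() calls.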
> dt.utcoffset() will
> only ever raise an exception when the user has explicitly requested
> construction of a datetime with strict checking.
>

If you allow constructing instances with failing .utcoffset(), these instances will make innocent-looking code capable of raising an error. For example dt1 in {dt2} will raise the same error as dt2.utcoffset() regardless of what dt1 is. You will have similar problems with dt1 == dt2, dictionary lookups, list searches and so on.

> It will never raise
> an exception in normal operation, including arithmetic, and could not
> cause problems in the situations you cite.
>

If I have two instances dt1 and dt2 with different .tzinfo and dt1.utcoffset() raises an exception, what should dt1 - dt2 return? Or dt1 == dt2? Or hash(dt1)?

>
> This would require the first argument to be available in all the
> methods that can construct datetime instances, including things not
> mentioned in the PEP like dt.astimezone()
>

.astimezone() is mentioned in the PEP [1], but I should probably add explicit discussion of how it should handle invalid times. In my view, the behavior of .astimezone() follows straight from that of .utcoffset(), but it may not be obvious from the PEP alone.

> (Where I think it might be
> using the first flag from the original dt instance, rather than
> allowing the user to specify which side of the fold in the target
> zone).
>

.astimezone() does not and need not "allow the user to specify which side of the fold in the target zone." As long as it knows how to interpret the time that it is given (disambiguate the fold and "normalize" the gap) it should be able to set the fold=1 attribute correctly in the result if it happens to fall into the repeated hour and should never produce a time that is in the gap of the target zone.

[1]: https://www.python.org/dev/peps/pep-0495/#conversion-from-naive-to-aware

From 4kir4.1i at gmail.com Tue Aug 25 17:47:30 2015
From: 4kir4.1i at gmail.com (Akira Li)
Date: Tue, 25 Aug 2015 18:47:30 +0300
Subject: [Datetime-SIG] PEP-431/495
In-Reply-To: (Alexander Belopolsky's message of "Tue, 25 Aug 2015 08:09:45 -0400")
References: <55DB57E7.20302@oddbird.net> <876143an34.fsf@gmail.com> <87zj1f8uer.fsf@gmail.com>
Message-ID: <87vbc38j59.fsf@gmail.com>

Alexander Belopolsky writes:

>> On Aug 25, 2015, at 7:44 AM, Akira Li <4kir4.1i at gmail.com> wrote:
>>
>> note: stdlib variant datetime.now(timezone.utc).astimezone() may fail if it
>> uses time.timezone, time.tzname internally [3,4,5] when tm_gmtoff
>> tm_zone are not available on a given platform.
>
> If this actually happens on any supported platform - please file a bug
> report. What we do in this case is not as simplistic as you describe.

Bug-driven development is probably not the best strategy for a datetime library ;) Tests can't catch all bugs. I've found out that astimezone() may fail by *reading* its source and trying to *understand* what it does.

Here's the part from datetime.py [1] that computes the local timezone if tm_gmtoff or tm_zone are not available:

    # Compute UTC offset and compare with the value implied
    # by tm_isdst. If the values match, use the zone name
    # implied by tm_isdst.
    delta = local - datetime(*_time.gmtime(ts)[:6])
    dst = _time.daylight and localtm.tm_isdst > 0
    gmtoff = -(_time.altzone if dst else _time.timezone)
    if delta == timedelta(seconds=gmtoff):
        tz = timezone(delta, _time.tzname[dst])
    else:
        tz = timezone(delta)

Here's its C equivalent [2].

Python issues that I've linked in the previous message [3,4,5] demonstrate that time.timezone and time.tzname may have wrong values and therefore the result *tz* may have a wrong tzname.
Here's an example inspired by "incorrect time.timezone value" Python issue [4]:

>>> from datetime import datetime, timezone
>>> from email.utils import parsedate_to_datetime
>>> import tzlocal # to get local timezone as pytz timezone
>>> d = parsedate_to_datetime("Tue, 28 Oct 2013 14:27:54 +0000")
>>> # expected (TZ=Europe/Moscow)
...
>>> d.astimezone(tzlocal.get_localzone()).strftime('%Z%z')
'MSK+0400'
>>> # got
...
>>> d.astimezone().strftime('%Z%z')
'UTC+04:00+0400'

'UTC+04:00' instead of 'MSK' is not a major issue. I don't consider it a bug because without access to the tz database stdlib can't do much better; there will always be cases when it breaks.

I just use pytz instead which does provide access to the tz database.

[1] https://github.com/python/cpython/blob/fced0e12fc510e4a6158628695774ccfd02395d3/Lib/datetime.py#L1513-L1522
[2] https://github.com/python/cpython/blob/fced0e12fc510e4a6158628695774ccfd02395d3/Modules/_datetimemodule.c#L4721-L4735
[3] http://bugs.python.org/issue1647654
[4] http://bugs.python.org/issue22752
[5] http://bugs.python.org/issue22798

From alexander.belopolsky at gmail.com Tue Aug 25 19:08:44 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 25 Aug 2015 13:08:44 -0400
Subject: [Datetime-SIG] time module issues Was: PEP-431/495
Message-ID:

I am changing the subject line because neither of the PEPs mentioned in the original subject proposes any changes to the time module. It is also likely that this discussion is off-topic on the Datetime-SIG mailing list and we should continue it on the relevant bug tracker issues.

On Tue, Aug 25, 2015 at 11:47 AM, Akira Li <4kir4.1i at gmail.com> wrote:
>
> Alexander Belopolsky writes:
>
> >> On Aug 25, 2015, at 7:44 AM, Akira Li <4kir4.1i at gmail.com> wrote:
> >>
> >> note: stdlib variant datetime.now(timezone.utc).astimezone() may fail if it
> >> uses time.timezone, time.tzname internally [3,4,5] when tm_gmtoff
> >> tm_zone are not available on a given platform.
> > > > If this actually happens on any supported platform - please file a bug > > report. What we do in this case is not as simplistic as you describe. > > Bug-driven development is probably not the best strategy for a datetime > library ;) Tests can't catch all bugs. I've found out that astimezone() > may fail by *reading* its source and trying to *understand* what it does. I agree, but once you've read the code and see any logical errors, you should be able to construct a test case demonstrating wrong behavior. > > Here's the part from datetime.py [1] that computes the local timezone if > tm_gmtoff or tm_zone are not available: > > # Compute UTC offset and compare with the value implied > # by tm_isdst. If the values match, use the zone name > # implied by tm_isdst. > delta = local - datetime(*_time.gmtime(ts)[:6]) > dst = _time.daylight and localtm.tm_isdst > 0 > gmtoff = -(_time.altzone if dst else _time.timezone) > if delta == timedelta(seconds=gmtoff): > tz = timezone(delta, _time.tzname[dst]) > else: > tz = timezone(delta) > > Here's its C equivalent [2]. > > Python issues that I've linked in the previous message [3,4,5] demonstrate > that time.timezone and time.tzname may have wrong values and therefore > the result *tz* may have a wrong tzname. To summarize for those who will not follow the links: [3] Is a closed "No obvious and correct way to get the time zone offset" issue. It was superseded by which in turn was closed by implementing the argument-less .astimezone() method. [4] and [5] are time module issues. > > Here's an example inspired by > "incorrect time.timezone value" Python issue [4]: > > >>> from datetime import datetime, timezone > >>> from email.utils import parsedate_to_datetime > >>> import tzlocal # to get local timezone as pytz timezone > >>> d = parsedate_to_datetime("Tue, 28 Oct 2013 14:27:54 +0000") > >>> # expected (TZ=Europe/Moscow) > ... > >>> d.astimezone(tzlocal.get_localzone()).strftime('%Z%z') > 'MSK+0400' > >>> # got > ... 
> >>> d.astimezone().strftime('%Z%z')
> 'UTC+04:00+0400'
>

I don't understand why you keep presenting a mix of pytz, email.utils and something called "tzlocal" and then claim that the unexpected behavior indicates a problem in the datetime module. It could as well be in any of the three other modules that you use or in the way you combine them.

If you want to parse the string "Tue, 28 Oct 2013 14:27:54 +0000" and convert it to Moscow time, here is how you do it using the datetime module:

>>> import os; os.environ['TZ'] = 'Europe/Moscow'
>>> from datetime import datetime
>>> d = datetime.strptime("Tue, 28 Oct 2013 14:27:54 +0000", "%a, %d %b %Y %H:%M:%S %z")
>>> d.astimezone().strftime("%F %T %Z%z")
'2013-10-28 18:27:54 MSK+0400'

Does this code behave differently on your system? If it does - please file a bug report.

>
> 'UTC+04:00' instead of 'MSK' is not a major issue. I don't consider it a
> bug because without access to the tz database stdlib can't do much
> better, there always be cases when it breaks.

It is quite possible that such cases exist, but you have not demonstrated one.

>
> I just use pytz instead which does provide access to the tz database.

This will always be your option as it is your option to use just the datetime module. In both cases you can write correct code if you follow the reference manual or buggy code if you don't. An almost sure way to write buggy code is to use one library manual to write code using another.

>
>
> [1] https://github.com/python/cpython/blob/fced0e12fc510e4a6158628695774ccfd02395d3/Lib/datetime.py#L1513-L1522
> [2] https://github.com/python/cpython/blob/fced0e12fc510e4a6158628695774ccfd02395d3/Modules/_datetimemodule.c#L4721-L4735
> [3] http://bugs.python.org/issue1647654
> [4] http://bugs.python.org/issue22752
> [5] http://bugs.python.org/issue22798

From stuart at stuartbishop.net Tue Aug 25 19:56:07 2015
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Wed, 26 Aug 2015 00:56:07 +0700
Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking
In-Reply-To:
References:
Message-ID:

On 25 August 2015 at 21:51, Alexander Belopolsky wrote:
> On Tue, Aug 25, 2015 at 9:30 AM, Stuart Bishop
> wrote:
>>
>> As mentioned elsewhere, pytz requires strict checking to remain
>> backwards compatible.
>
> Can you provide the specific examples where strict checking is required?

Systems where it is better to fail than continue with incorrect results. For example, ingesting transaction logs. It is more desirable for the script parsing the log files to fail with a traceback than to feed incorrect results into the rest of the system, where some poor DBA is going to have to repair the cascade of damage months or years later.

Yes, pytz already has a disambiguation solution. I would love to be able to deprecate it, and encourage users to use stdlib as much as possible.

> Since pytz already has a disambiguation solution, PEP 495 is not as
> indispensable for it as it is for datetime or dateutil. However, there is
> one case I can think of where pytz will benefit: with the PEP, it will be
> possible to make say Eastern.localize(datetime.now()) work correctly at all
> times. If for backward compatibility, you want to continue raising
> AmbiguousTimeError during one hour each year, I am sure you will figure out
> how to make Eastern.localize(datetime.now(), isdst=None). (Hint: Don't
> change anything in this branch of your code.)

Yes, that is nice.
What would be even nicer is if users didn't have to use localize at all: >> datetime.now(tz=pytz.timezone('US/Eastern')) > I assume you refer to "Another suggestion was to use first=-1 or first=None > to indicate that > the program truly has no means to deal with the folds and gaps and > dt.utcoffset() should raise an error whenever dt represents an > ambiguous or missing local time." > > It looks like you want to make it impossible to construct invalid dt > instances. In other words, you want to make dt.replace(fold=-1) or > dt.replace(tzinfo=Eastern) raise an error under certain circumstances. Is > this right? Right. When a user requests that exceptions are raised, it becomes impossible to construct invalid dt instances. This does require calling pytz Python code from the datetime constructor, which you discuss below. >> pytz users need to optionally have exceptions raised >> when they try to construct an invalid or ambiguous datetime instance, > > This is a legitimate need, by why does it need to be done in datetime rather > than in pytz itself? You already ignore the datetime(..., tzinfo=...) > constructor and require your users to call localize() instead. What stops > you from providing a function strict_datetime() that will perform any checks > that you or your users desire? My goal is to have no pytz specific API, or at least minimize it and make it unnecessary for the most common use cases. There will be much less confusion, and it will be easier for large projects to migrate to stdlib if it meets their needs. >> directly via __new__ or indirectly with something like dt.replace(). > > __new__ and .replace() are low level methods called in many performance > critical places. We cannot afford to call arbitrary python code in those > methods. This seems to be the crux of the issue. The datetime constructor needs to call into the tzinfo implementation, in the same way as converting from utc time invokes tzinfo.fromutc(dt). 
If it did this, pytz could swap in the correct fixed offset tzinfo and, if requested, perhaps raise an exception. The localize method would be gone entirely, the biggest half of pytz' problematic API. The datetimes filtered through pytz would always be valid, and dt.utcoffset() will never raise an exception causing confusing failures.

Is the overhead on calling a method on the tzinfo that bad? If the tzinfo implementation does not override it, it should still be fast. pytz users are already paying the overhead in the form of the localize method (which seems about 20x slower than just using datetime.now() unwrapped, 20usec vs 2usec according to timeit. But if you cared, you would be using time.time() at 0.08usec). Is there some way of me providing a hook that doesn't cause major slowdowns for non pytz users?

For what it's worth, the only complaints I've ever had about performance with pytz have been about how long it took to import the package. I'm not sure what sort of application instantiates so many timezone aware datetime instances that constructor overhead becomes noticeable. All examples I can come up with that create timestamps so rapidly would never add timezone information, and would be using a float or long for further optimization in any case.

>> If these methods are called with first=None, it will be passed through
>> to dt.utcoffset() and it may raise an exception.
>
> This part I don't understand. If __new__ raises an exception - you will
> have no instance to "be passed through to dt.utcoffset()."

Sorry. I'm mixing up tzinfo.utcoffset() and dt.utcoffset() here.

> .astimezone() does not and need not "allow the user to specify which side
> of the fold in the target zone."
As long as it knows how to interpret the > time that it is given (disambiguate the fold and "normalize" the gap) it > should be able to set the fold=1 attribute correctly in the result if it > happen to fall into the repeated hour and should never produce a time that > is in the gap of the target zone. Yes. I got this backwards sorry. -- Stuart Bishop http://www.stuartbishop.net/ From ischwabacher at wisc.edu Tue Aug 25 20:37:10 2015 From: ischwabacher at wisc.edu (ISAAC J SCHWABACHER) Date: Tue, 25 Aug 2015 18:37:10 +0000 Subject: [Datetime-SIG] Conversion vs arithmetic (was Re: Is EDT a timezone? Was: PEP-0500) In-Reply-To: References: Message-ID: [ijs] > > But I agree, PEP 495 is a small and straightforward step in the > > right direction. Though it's not looking like the bikeshed is going > > to get painted my color. :) [Tim] > What if we renamed "fold" to "ijs"? We aim to please ;-) Heh. I argued that pandas should call it "ambiguous" and allow it the values True | False | 'NaT' | 'raise' | 'infer', where the boolean values had the same sense as if the flag were "is_dst" or "first" and 'NaT' and 'infer' were important for the vectorized case. And I convinced them, and they painted the bikeshed my color, and now I have doubts and wonder whether the flag should simply be called "is_dst" since that's what everyone else calls it (so that must be how they think of it, right?) and accompanied by a stern corner-case warning in the docs. Which is to say that I think it's important that the flag's name be grokkable. "is_dst" is grokkable. "first" is grokkable, once you get past the point of "first what?". "fold" is not grokkable. It's also not discoverable, which is also a problem for "first"-- the name of the field doesn't tell you what problem it solves. Nobody is going to be confused about what "is_dst" means (which is great until you get to a corner case where it doesn't mean that). So I'm torn between "is_dst" and "first", but I really don't like "fold". 
And that's all I have to say about that. > > ... > > I assume pytz is committed for backward compatibility to timeline > > arithmetic for datetime subtraction, which means it can't afford to > > change its "one tz instance per offset" design as long as datetime > > arithmetic ignores the time zone when it's identical. > > For backward compatibility, and for consistency with classic datetime > + timedelta addition, the default datetime subtraction won't change. > They both have to use the same kind of arithmetic for the obvious > identities to hold. But nothing in PEP 495 precludes offering > optional timeline arithmetic later - arithmetic just isn't in 495's > scope. > > > There's no hook for pytz to hang its behavior on, and PEP 495 > > doesn't give it one. > > I don't grasp why people keep bringing up issues PEP 495 never > intended to address when talking _about_ 495. PEP 495 is _not_ a > proposed substitute for PEP 431, and Alexander never claimed it was. > It's a different path entirely, prudently (IMO) proposing a relatively > small change to get things moving in a useful direction. Even if all > related activity stops with 495, fixing conversions alone is worth > some effort. Not even appealing to "naive time" can really excuse > screwing up conversions _between_ naive times ;-) I brought it up because I'm looking at PEP 495 through the lens of "how much simpler does this allow pytz's API to become?" Support for timeline arithmetic is a major constraint on pytz's design, and it could hamper that module's ability to make use of the gains from PEP 495-- in which case I would argue that the scope of the PEP should be expanded at least enough to permit pytz to partake. I initially thought this would be necessary, but then reread the relevant parts of _datetimemodule.c and came to the conclusion that it's not. But I admit I did not motivate the connection in my post. 
> > On the upside, it does look like there's a way to foil the dreaded > > self check in datetime_astimezone and make dt.astimezone(tz) > > work as intended. > > I can pretty much guarantee it always worked as intended ;-) As Stuart intended, when tz is a pytz.DstTzInfo object and in the corner case that dt is a non-normalized datetime in the same time zone (i.e., dt.tzinfo.utcoffset() returns the wrong offset). Jeez, how could that not be obvious? ;D > If you > mean that 495 will make it work when dt is naive, that's marginal to > me. Treating a naive datetime as being a local time is as senseless > as treating it as a UTC time - it's arbitrary. I'd rather > datetime.timezone grew a name for a tzinfo representing the system > timezone, so "local time" became as easy to spell explicitly (when > desired) as "utc" has become already. But that's out of scope for 495 > too. That's not what I mean; I agree with you here. ijs From alexander.belopolsky at gmail.com Tue Aug 25 20:42:53 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 25 Aug 2015 14:42:53 -0400 Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking In-Reply-To: References: Message-ID: On Tue, Aug 25, 2015 at 1:56 PM, Stuart Bishop wrote: > On 25 August 2015 at 21:51, Alexander Belopolsky > wrote: > > > On Tue, Aug 25, 2015 at 9:30 AM, Stuart Bishop > > wrote: > >> > >> As mentioned elsewhere, pytz requires strict checking to remain > >> backwards compatible. > > > > Can you provide the specific examples where strict checking is required? > > Systems where it is better to fail than continue with incorrect > results. For example, ingesting transaction logs. It is more desirable > for the script parsing the log files to fail with a traceback than to > feed incorrect results into the rest of the system, where some poor > DBA is going to have to repair the cascade of damage months or years > later. 
> Speaking of DBAs, how would she feel if a system zoneinfo upgrade made her database unreadable? A zoneinfo upgrade can easily create new gaps and folds even in the past if someone at IANA decides that they found a better source of historical information. -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Tue Aug 25 20:49:21 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 25 Aug 2015 14:49:21 -0400 Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking In-Reply-To: References: Message-ID: On Tue, Aug 25, 2015 at 1:56 PM, Stuart Bishop wrote: > What would be even nicer is if users didn't have to use localize at all: > > >> datetime.now(tz=pytz.timezone('US/Eastern')) > This is certainly one of the main goals of PEP 495. Note that datetime.now() will never produce an invalid datetime and will always set the fold attribute correctly in the (two-)fold case if pytz.timezone('US/Eastern') follows PEP guidelines. Therefore there is no need for the third fold value in this case. -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at oddbird.net Tue Aug 25 21:36:46 2015 From: carl at oddbird.net (Carl Meyer) Date: Tue, 25 Aug 2015 13:36:46 -0600 Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking In-Reply-To: References: Message-ID: <55DCC3CE.7070200@oddbird.net> Hi Alexander, On 08/25/2015 08:51 AM, Alexander Belopolsky wrote: > On Tue, Aug 25, 2015 at 9:30 AM, Stuart Bishop > wrote: > > As mentioned elsewhere, pytz requires strict checking to remain > backwards compatible. > > > Can you provide the specific examples where strict checking is > required? I mentioned my most recent use case for pytz's strict checking in another thread, but I did not describe it thoroughly and you may have missed it. I'll try to do better here, in case it's helpful. 
The system is a calendaring application that allows users to request an appointment in any open slot, where slots are fixed half-hour blocks. A day is displayed as a column of half-hour slots (48 per day). Some slots are open and available for the user to click and request an appointment; some are not. The system is multi-user and timezone-aware; different users may be in different timezones and must be able to schedule appointments with one another.

To generate a day's calendar to display to the user, I `combine()` the day's date and the start/end time for each slot into a naive datetime, and `pytz.localize()` that naive datetime into the timezone of the user viewing the calendar. Those start/end datetimes, and information about the availability status of the slot, are included in a small data structure for each slot. This data structure is made available to the client-side (JavaScript) code so it can display the slots with appropriate styling, and make the right appointment-request to the server if a user clicks an available slot.

(Now, I'm sure there are various ways this design could be critiqued and explored; I'd personally find that interesting and enlightening, but I don't think it'd be useful to this thread. For the purposes of the thread, the question is whether the design is at least reasonable and functional. It is currently working well in production, and I believe it is reasonable.)

Given the unlikelihood of someone trying to schedule an appointment at 2am, I didn't care much about the specific choice of user experience for the DST gaps and folds; my overriding concerns were to a) not crash the application when displaying a DST-transition day, and b) minimize the harm done to my codebase (in fragility or added complexity) for the sake of handling this rare edge case.

Options I considered:

1) Just set the `is_dst` flag to either True or False always. In a fold, this inevitably results in at least one slot which ends before it starts. My slot data structure was built to raise an exception in that scenario. I didn't want to remove this useful sanity check just for this rare case. Also in a gap, this could result in apparently hour-and-a-half long slots, which introduced complexity on the display side.

2) Avoid "ends before it starts" by figuring out which value of `is_dst` meant "later" or "earlier" in each case and always using the earlier choice for "start" and the later choice for "end." Translating between `is_dst` and "earlier/later" isn't trivial to do generically, so I'd have to localize all datetimes twice. This solution also results in overlapping slots, which (without further changes to the design) could in some cases result in a single existing appointment being displayed to the user twice on the calendar.

3) Consider any slot with an invalid or ambiguous start or end time to be an invalid slot and always display it to the user as "unavailable", thus preventing any appointment from ever being requested in that slot. I implemented this by setting `is_dst=None` and catching `InvalidTimeError`, and it turned out to be a very simple solution to the problem. The effect is an always-unavailable couple of slots right around 2am twice a year, which is totally fine, and better than e.g. double-displaying appointments.

I went through this exact progression of options and settled on the third one, and was quite happy with its simplicity and lack of cascading effects on the rest of the code. I'm sure I could have found other workable options, but it seems to me that at least this indicates that there are reasonable solutions to some problems where "strict checking" is useful.

I'm aware that PEP 495 as it stands would let me write a function to do the strict checking myself (at the cost of localizing all my datetimes twice, I think).
I could probably live with that, though the ugliness of it might have pushed me in a different direction, where the simplicity of `is_dst=None` in pytz made option 3 attractive.

In any case, I don't feel strongly enough to argue any more about this (especially since I trust Stuart to maintain a backwards-compatible pytz API regardless -- thanks Stuart!); I'm just providing a real-world use case for strict checking, since you asked for one.

Carl

From stuart at stuartbishop.net Tue Aug 25 21:37:37 2015
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Wed, 26 Aug 2015 02:37:37 +0700
Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking
In-Reply-To:
References:
Message-ID:

On 26 August 2015 at 01:42, Alexander Belopolsky wrote:

> Speaking of DBAs, how would she feel if a system zoneinfo upgrade made her
> database unreadable? A zoneinfo upgrade can easily create new gaps and
> folds even in the past if someone at IANA decides that they found a better
> source of historical information.

Speaking as a DBA, I know that my timestamps with timezones are converted and stored as UTC time on entry so this doesn't happen. You use timestamp with timezone to store an absolute time. It does, however, mean that revisions to the IANA database may change the result when it is rendered back in its original timezone. If you don't want that, you actually want a timestamp without timezone (naive), and maybe a physical location (because the local government may declare that this year they will use central time instead of mountain time).
-- Stuart Bishop http://www.stuartbishop.net/ From carl at oddbird.net Tue Aug 25 21:39:52 2015 From: carl at oddbird.net (Carl Meyer) Date: Tue, 25 Aug 2015 13:39:52 -0600 Subject: [Datetime-SIG] PEP 495 Q & A In-Reply-To: <55DBB82A.9040901@oddbird.net> References: <55DBB82A.9040901@oddbird.net> Message-ID: <55DCC488.6020804@oddbird.net> Another possible name for the flag/index just occurred to me: what about `which`? It doesn't read quite right for a boolean, but it reads very nicely if the attribute is in fact an integer index. And I think it describes the _function_ of the flag (to pick _which_ of possibly-multiple ambiguous datetimes to choose) more clearly than any of the other proposed options. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From carl at oddbird.net Tue Aug 25 21:42:53 2015 From: carl at oddbird.net (Carl Meyer) Date: Tue, 25 Aug 2015 13:42:53 -0600 Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking In-Reply-To: <55DCC3CE.7070200@oddbird.net> References: <55DCC3CE.7070200@oddbird.net> Message-ID: <55DCC53D.4030400@oddbird.net> On 08/25/2015 01:36 PM, Carl Meyer wrote: ... > 1) Just set the `is_dst` flag to either True or False always. In a fold, > this inevitably results in at least one slot which ends before it > starts. My slot data structure was built to raise an exception in that > scenario. I didn't want to remove this useful sanity check just for this > rare case. ... Note that this represents a case where the default "just guess" behavior would actually result in the worst possible outcome: a crash that only occurs on transition days. Carl -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From alexander.belopolsky at gmail.com Tue Aug 25 21:46:19 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 25 Aug 2015 15:46:19 -0400 Subject: [Datetime-SIG] PEP 495 Q & A In-Reply-To: <55DCC488.6020804@oddbird.net> References: <55DBB82A.9040901@oddbird.net> <55DCC488.6020804@oddbird.net> Message-ID: On Tue, Aug 25, 2015 at 3:39 PM, Carl Meyer wrote: > > Another possible name for the flag/index just occurred to me: what about > `which`? That was in my very first proposal: """ In other words, instead of localtime(dt, isdst=-1), we may want localtime(dt, which=0) where "which" is used to resolve the ambiguity: "which=0" means return the first (in UTC order) of the two times and "which=1" means return the second. (In the non-ambiguous cases "which" is ignored.) """ -- https://mail.python.org/pipermail/python-dev/2015-April/139099.html The name did not catch up. -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at oddbird.net Tue Aug 25 21:48:30 2015 From: carl at oddbird.net (Carl Meyer) Date: Tue, 25 Aug 2015 13:48:30 -0600 Subject: [Datetime-SIG] PEP 495 Q & A In-Reply-To: References: <55DBB82A.9040901@oddbird.net> <55DCC488.6020804@oddbird.net> Message-ID: <55DCC68E.8030903@oddbird.net> On 08/25/2015 01:46 PM, Alexander Belopolsky wrote: > > On Tue, Aug 25, 2015 at 3:39 PM, Carl Meyer > wrote: >> >> Another possible name for the flag/index just occurred to me: what about >> `which`? > > That was in my very first proposal: > > """ > In other words, instead of localtime(dt, isdst=-1), we may want > localtime(dt, which=0) where "which" is used to resolve the ambiguity: > "which=0" means return the first (in UTC order) of the two times and > "which=1" means return the second. (In the non-ambiguous cases "which" is > ignored.) 
> """ -- https://mail.python.org/pipermail/python-dev/2015-April/139099.html > > The name did not catch up. Ha! Well in that case, consider this a vote of confidence in your intuition -- I think it's the best of the options that have been discussed. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From ischwabacher at wisc.edu Tue Aug 25 21:56:38 2015 From: ischwabacher at wisc.edu (ISAAC J SCHWABACHER) Date: Tue, 25 Aug 2015 19:56:38 +0000 Subject: [Datetime-SIG] PEP 495 Q & A In-Reply-To: <55DCC68E.8030903@oddbird.net> References: <55DBB82A.9040901@oddbird.net> <55DCC488.6020804@oddbird.net> <55DCC68E.8030903@oddbird.net> Message-ID: I like it. It's obvious from the field name what problem it solves, and which value of the flag corresponds to which instant in time. ________________________________________ From: Datetime-SIG on behalf of Carl Meyer Sent: Tuesday, August 25, 2015 14:48 To: Alexander Belopolsky Cc: datetime-sig Subject: Re: [Datetime-SIG] PEP 495 Q & A On 08/25/2015 01:46 PM, Alexander Belopolsky wrote: > > On Tue, Aug 25, 2015 at 3:39 PM, Carl Meyer > wrote: >> >> Another possible name for the flag/index just occurred to me: what about >> `which`? > > That was in my very first proposal: > > """ > In other words, instead of localtime(dt, isdst=-1), we may want > localtime(dt, which=0) where "which" is used to resolve the ambiguity: > "which=0" means return the first (in UTC order) of the two times and > "which=1" means return the second. (In the non-ambiguous cases "which" is > ignored.) > """ -- https://mail.python.org/pipermail/python-dev/2015-April/139099.html > > The name did not catch up. Ha! Well in that case, consider this a vote of confidence in your intuition -- I think it's the best of the options that have been discussed. 
Carl

From alexander.belopolsky at gmail.com Tue Aug 25 22:10:08 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 25 Aug 2015 16:10:08 -0400
Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking
In-Reply-To: <55DCC3CE.7070200@oddbird.net>
References: <55DCC3CE.7070200@oddbird.net>
Message-ID:

On Tue, Aug 25, 2015 at 3:36 PM, Carl Meyer wrote:

> Given the unlikelihood of someone trying to schedule an appointment at
> 2am, I didn't care much about the specific choice of user experience for
> the DST gaps and folds; my overriding concerns were to a) not crash the
> application when displaying a DST-transition day, and b) minimize the
> harm done to my codebase (in fragility or added complexity) for the sake
> of handling this rare edge case.

Here is my recommendation for your case. Assuming that taking user entry is equivalent to parsing a date and time string in some format and scheduling is equivalent to writing a seconds-since-epoch timestamp into some database, do the following post-PEP 495:

    dt = datetime.strptime(input, format)
    s = dt.timestamp()
    schedule(s)
    # optionally
    sdt = datetime.fromtimestamp(s)
    if dt != sdt:
        warn("You specified an invalid time %s, we scheduled your appointment for %s", dt, sdt)

Instead of a warning, your program may highlight the changed hour in red.

This program will have no means of scheduling in the "fold=1" hour and will correct you if you enter a time that falls in a gap. The PEP 495 design is motivated by precisely this type of application.

From carl at oddbird.net Tue Aug 25 22:13:09 2015
From: carl at oddbird.net (Carl Meyer)
Date: Tue, 25 Aug 2015 14:13:09 -0600
Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking
In-Reply-To:
References: <55DCC3CE.7070200@oddbird.net>
Message-ID: <55DCCC55.2040908@oddbird.net>

On 08/25/2015 02:10 PM, Alexander Belopolsky wrote:
>
> On Tue, Aug 25, 2015 at 3:36 PM, Carl Meyer wrote:
>
>     Given the unlikelihood of someone trying to schedule an appointment at
>     2am, I didn't care much about the specific choice of user experience for
>     the DST gaps and folds; my overriding concerns were to a) not crash the
>     application when displaying a DST-transition day, and b) minimize the
>     harm done to my codebase (in fragility or added complexity) for the sake
>     of handling this rare edge case.
>
> Here is my recommendation for your case. Assuming that taking user
> entry is equivalent to parsing a date and time string in some format and
> scheduling is equivalent to writing a seconds-since-epoch timestamp into
> some database, do the following post-PEP 495:
>
>     dt = datetime.strptime(input, format)
>     s = dt.timestamp()
>     schedule(s)
>     # optionally
>     sdt = datetime.fromtimestamp(s)
>     if dt != sdt:
>         warn("You specified an invalid time %s, we scheduled your
>     appointment for %s", dt, sdt)
>
> Instead of a warning, your program may highlight the changed hour in red.
>
> This program will have no means of scheduling in the "fold=1" hour and
> will correct you if you enter a time that falls in a gap. The PEP 495
> design is motivated by precisely this type of application.

You are missing the crux of my use case, which is that I need to generate a calendar to display to the user, with all the half-hour slots from midnight to midnight for one day in it (and I need the actual timezone-aware instants of the start and end time for each slot so that I can properly populate its availability status based on a database table of existing appointments, with start and end times stored in UTC).

So your solution, which is all about "taking user entry," is not relevant to the use case.

Carl

From alexander.belopolsky at gmail.com Tue Aug 25 22:24:58 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 25 Aug 2015 16:24:58 -0400
Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking
In-Reply-To: <55DCCC55.2040908@oddbird.net>
References: <55DCC3CE.7070200@oddbird.net> <55DCCC55.2040908@oddbird.net>
Message-ID:

On Tue, Aug 25, 2015 at 4:13 PM, Carl Meyer wrote:

> You are missing the crux of my use case, which is that I need to
> generate a calendar to display to the user, with all the half-hour slots
> from midnight to midnight for one day in it

Got it. How is this then

    start = datetime.combine(date, time(0)).astimezone()
    while True:
        end = (start + timedelta(hours=0.5)).astimezone()
        print(start, end)
        start = end
        if end.time() == time(0):
            break

From carl at oddbird.net Tue Aug 25 22:30:32 2015
From: carl at oddbird.net (Carl Meyer)
Date: Tue, 25 Aug 2015 14:30:32 -0600
Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking
In-Reply-To:
References: <55DCC3CE.7070200@oddbird.net> <55DCCC55.2040908@oddbird.net>
Message-ID: <55DCD068.9020604@oddbird.net>

On 08/25/2015 02:24 PM, Alexander Belopolsky wrote:
>
> On Tue, Aug 25, 2015 at 4:13 PM, Carl Meyer wrote:
>
>     You are missing the crux of my use case, which is that I need to
>     generate a calendar to display to the user, with all the half-hour slots
>     from midnight to midnight for one day in it
>
> Got it. How is this then
>
>     start = datetime.combine(date, time(0)).astimezone()
>     while True:
>         end = (start + timedelta(hours=0.5)).astimezone()
>         print(start, end)
>         start = end
>         if end.time() == time(0):
>             break

On DST transition days, this code does not generate exactly 48 slots, displayable to the user in a schedule that includes all hours from 0 to 23 labeled exactly once.

That introduces an unacceptable level of additional display-layer complexity, and remains inferior to the "strict checking" solution I chose.

Carl

From alexander.belopolsky at gmail.com Tue Aug 25 22:30:43 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 25 Aug 2015 16:30:43 -0400
Subject: [Datetime-SIG] PEP 495 Q & A
In-Reply-To:
References: <55DBB82A.9040901@oddbird.net> <55DCC488.6020804@oddbird.net> <55DCC68E.8030903@oddbird.net>
Message-ID:

On Tue, Aug 25, 2015 at 3:56 PM, ISAAC J SCHWABACHER wrote:

> I like it. It's obvious from the field name what problem it solves, and
> which value of the flag corresponds to which instant in time.

Which problem does "which" solve? Obvious! It tells you who is on first. (Answer: not "who" - "which"!)

From carl at oddbird.net Tue Aug 25 22:33:57 2015
From: carl at oddbird.net (Carl Meyer)
Date: Tue, 25 Aug 2015 14:33:57 -0600
Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking
In-Reply-To: <55DCD068.9020604@oddbird.net>
References: <55DCC3CE.7070200@oddbird.net> <55DCCC55.2040908@oddbird.net> <55DCD068.9020604@oddbird.net>
Message-ID: <55DCD135.8090801@oddbird.net>

On 08/25/2015 02:30 PM, Carl Meyer wrote:
> On 08/25/2015 02:24 PM, Alexander Belopolsky wrote:
>>
>> On Tue, Aug 25, 2015 at 4:13 PM, Carl Meyer wrote:
>>
>>     You are missing the crux of my use case, which is that I need to
>>     generate a calendar to display to the user, with all the half-hour slots
>>     from midnight to midnight for one day in it
>>
>> Got it. How is this then
>>
>>     start = datetime.combine(date, time(0)).astimezone()
>>     while True:
>>         end = (start + timedelta(hours=0.5)).astimezone()
>>         print(start, end)
>>         start = end
>>         if end.time() == time(0):
>>             break
>
> On DST transition days, this code does not generate exactly 48 slots,
> displayable to the user in a schedule that includes all hours from 0 to
> 23 labeled exactly once.
>
> That introduces an unacceptable level of additional display-layer
> complexity, and remains inferior to the "strict checking" solution I chose.

To further explain this requirement: days can be displayed adjacent to each other in a weekly view, so the server must provide to the client exactly 48 slot data structures per day (including DST transition days), which can be laid out in a strict grid, where all the days share the same set of time labels on the left.

Carl

From alexander.belopolsky at gmail.com Tue Aug 25 22:42:35 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 25 Aug 2015 16:42:35 -0400
Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking
In-Reply-To: <55DCD068.9020604@oddbird.net>
References: <55DCC3CE.7070200@oddbird.net> <55DCCC55.2040908@oddbird.net> <55DCD068.9020604@oddbird.net>
Message-ID:

On Tue, Aug 25, 2015 at 4:30 PM, Carl Meyer wrote:

> > start = datetime.combine(date, time(0)).astimezone()
> > while True:
> >     end = (start + timedelta(hours=0.5)).astimezone()
> >     print(start, end)
> >     start = end
> >     if end.time() == time(0):
> >         break
>
> On DST transition days, this code does not generate exactly 48 slots,
> displayable to the user in a schedule that includes all hours from 0 to
> 23 labeled exactly once.
>
> That introduces an unacceptable level of additional display-layer
> complexity, and remains inferior to the "strict checking" solution I chose.

OK. So you just want

    [datetime.combine(date, time(0)) + timedelta(hours=0.5*i) for i in range(48)]

but detect those that are either in the fold or in the gap? Just check for

    dt.astimezone() != dt.replace(fold=1).astimezone()

instead of catching the strict exceptions.

From carl at oddbird.net Tue Aug 25 22:48:25 2015
From: carl at oddbird.net (Carl Meyer)
Date: Tue, 25 Aug 2015 14:48:25 -0600
Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking
In-Reply-To:
References: <55DCC3CE.7070200@oddbird.net> <55DCCC55.2040908@oddbird.net> <55DCD068.9020604@oddbird.net>
Message-ID: <55DCD499.6020801@oddbird.net>

On 08/25/2015 02:42 PM, Alexander Belopolsky wrote:
>
> On Tue, Aug 25, 2015 at 4:30 PM, Carl Meyer wrote:
>
>     > start = datetime.combine(date, time(0)).astimezone()
>     > while True:
>     >     end = (start + timedelta(hours=0.5)).astimezone()
>     >     print(start, end)
>     >     start = end
>     >     if end.time() == time(0):
>     >         break
>
>     On DST transition days, this code does not generate exactly 48 slots,
>     displayable to the user in a schedule that includes all hours from 0 to
>     23 labeled exactly once.
>
>     That introduces an unacceptable level of additional display-layer
>     complexity, and remains inferior to the "strict checking" solution I
>     chose.
>
> OK. So you just want [datetime.combine(date, time(0)) +
> timedelta(hours=0.5*i) for i in range(48)] but detect those that are
> either in the fold or in the gap? Just check for dt.astimezone() !=
> dt.replace(fold=1).astimezone() instead of catching the strict exceptions.

Yes, I know :-) If you'd read my email to the end, you'd have seen that I already know that that's the PEP 495 solution to my problem, and that it's probably fine if wrapped up in a utility function, though the ugliness of having to do all timezone localizations twice might have caused me to spend another couple hours looking for a better solution, whereas the convenience, availability, and discoverability of pytz's `is_dst=None` option made things simple and easy for me.

I'm glad you've at least reached the point of acknowledging that valid use cases for strict checking exist; this seems to be a new development :-) That was really my only intent here.
If you accept this, and still think that the best API datetime can provide is "localize twice and figure it out yourself", I'm not interested in arguing the point.

Carl

From alexander.belopolsky at gmail.com Tue Aug 25 22:56:40 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 25 Aug 2015 16:56:40 -0400
Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking
In-Reply-To: <55DCD135.8090801@oddbird.net>
References: <55DCC3CE.7070200@oddbird.net> <55DCCC55.2040908@oddbird.net> <55DCD068.9020604@oddbird.net> <55DCD135.8090801@oddbird.net>
Message-ID:

On Tue, Aug 25, 2015 at 4:33 PM, Carl Meyer wrote:

> To further explain this requirement: days can be displayed adjacent to
> each other in a weekly view, so the server must provide to the client
> exactly 48 slot data structures per day (including DST transition days),
> which can be laid out in a strict grid, where all the days share the
> same set of time labels on the left.

Whatever your requirements are, the difference between "strict checking" and PEP 495 code will be the same as between

    dt = get_naive_time()
    try:
        dt = dt.astimezone()
    except AmbiguousTimeError:
        # do something
    except MissingTimeError:
        # do something else
    else:
        # use dt

and

    dt = get_naive_time()
    dt0 = dt.replace(fold=0).astimezone()
    dt1 = dt.replace(fold=1).astimezone()
    if dt1 > dt0:
        # do something
    elif dt1 < dt0:
        # do something else
    else:
        # use dt0 which is the same as dt1

The PEP 495 code may be slightly more involved, but most people can get away with just

    dt = get_time().astimezone()

and not see any crashes.
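
The fold comparison above can be wrapped in a small utility and applied to the 48-slot grid described earlier in the thread. What follows is only a sketch under stated assumptions: `ToyEastern`, `classify` and `day_slots` are hypothetical names, and the toy zone hard-codes the 2015 US Eastern transitions instead of reading a real tz database. It compares UTC offsets rather than calling `astimezone()`, which should be equivalent for this purpose, and it relies on the `fold` attribute as it later shipped in Python 3.6.

```python
from datetime import date, datetime, time, timedelta, tzinfo

HOUR = timedelta(hours=1)


class ToyEastern(tzinfo):
    """Hypothetical zone: UTC-5 standard, UTC-4 DST, 2015 transitions only."""

    GAP_START = datetime(2015, 3, 8, 2)    # wall times 02:00-02:59 are skipped
    FOLD_START = datetime(2015, 11, 1, 1)  # wall times 01:00-01:59 occur twice

    def dst(self, dt):
        wall = dt.replace(tzinfo=None)  # naive comparisons ignore fold
        if self.GAP_START <= wall < self.GAP_START + HOUR:
            # In a gap, fold=0 extends the rule in effect before the transition.
            return HOUR if dt.fold else timedelta(0)
        if self.FOLD_START <= wall < self.FOLD_START + HOUR:
            # In a fold, fold=0 is the first (DST) reading, fold=1 the second.
            return timedelta(0) if dt.fold else HOUR
        if self.GAP_START + HOUR <= wall < self.FOLD_START:
            return HOUR
        return timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


def classify(naive, tz):
    """Return 'ambiguous', 'missing' or 'ok' for a naive wall time in tz."""
    first = naive.replace(tzinfo=tz, fold=0).utcoffset()
    second = naive.replace(tzinfo=tz, fold=1).utcoffset()
    if first > second:   # same as dt1 > dt0 in UTC terms
        return "ambiguous"
    if first < second:   # same as dt1 < dt0 in UTC terms
        return "missing"
    return "ok"


def day_slots(day, tz):
    """Always 48 (start, end, usable) triples; a slot whose start or end
    falls in a gap or fold is flagged unusable instead of raising."""
    midnight = datetime.combine(day, time(0))
    slots = []
    for i in range(48):
        start = midnight + timedelta(minutes=30 * i)  # naive wall-clock steps
        end = start + timedelta(minutes=30)
        usable = classify(start, tz) == "ok" and classify(end, tz) == "ok"
        slots.append((start, end, usable))
    return slots


for day in (date(2015, 3, 8), date(2015, 11, 1)):
    blocked = [s.time() for s, _, ok in day_slots(day, ToyEastern()) if not ok]
    print(day, blocked)
```

The grid keeps its fixed 48 rows on transition days, which is the display constraint stated above; only the availability flag changes. The cost is the one already noted in the thread: every wall time is resolved twice.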

From carl at oddbird.net Tue Aug 25 23:01:12 2015
From: carl at oddbird.net (Carl Meyer)
Date: Tue, 25 Aug 2015 15:01:12 -0600
Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking
In-Reply-To:
References: <55DCC3CE.7070200@oddbird.net> <55DCCC55.2040908@oddbird.net> <55DCD068.9020604@oddbird.net> <55DCD135.8090801@oddbird.net>
Message-ID: <55DCD798.6060102@oddbird.net>

On 08/25/2015 02:56 PM, Alexander Belopolsky wrote:
...
>
> The PEP 495 code may be slightly more involved, but most people can get
> away with just
>
>     dt = get_time().astimezone()
>
> and not see any crashes.

Since nobody has argued (that I can recall) for "raise an exception" to be the _default_ behavior, this last point will be true regardless.

Carl

From alexander.belopolsky at gmail.com Tue Aug 25 23:01:38 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 25 Aug 2015 17:01:38 -0400
Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking
In-Reply-To: <55DCD499.6020801@oddbird.net>
References: <55DCC3CE.7070200@oddbird.net> <55DCCC55.2040908@oddbird.net> <55DCD068.9020604@oddbird.net> <55DCD499.6020801@oddbird.net>
Message-ID:

On Tue, Aug 25, 2015 at 4:48 PM, Carl Meyer wrote:

> If you accept this, and still
> think that the best API datetime can provide is "localize twice and
> figure it out yourself", I'm not interested in arguing the point.

Trust me, I considered many other options, but I have 4000+ lines of datetime unit tests that I cannot break. If you can come up with a patch that does what you want and passes python -m test test_datetime - I would certainly like to see it.

From carl at oddbird.net Tue Aug 25 23:04:10 2015
From: carl at oddbird.net (Carl Meyer)
Date: Tue, 25 Aug 2015 15:04:10 -0600
Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking
In-Reply-To:
References: <55DCC3CE.7070200@oddbird.net> <55DCCC55.2040908@oddbird.net> <55DCD068.9020604@oddbird.net> <55DCD499.6020801@oddbird.net>
Message-ID: <55DCD84A.5060006@oddbird.net>

On 08/25/2015 03:01 PM, Alexander Belopolsky wrote:
> On Tue, Aug 25, 2015 at 4:48 PM, Carl Meyer wrote:
>
>     If you accept this, and still
>     think that the best API datetime can provide is "localize twice and
>     figure it out yourself", I'm not interested in arguing the point.
>
> Trust me, I considered many other options, but I have 4000+ lines of
> datetime unit tests that I cannot break. If you can come up with a
> patch that does what you want and passes python -m test test_datetime -
> I would certainly like to see it.

I don't understand why it seems like you continue to interpret all requests for an opt-in way to raise an exception as requests to change the default, which nobody is contemplating.

I can't imagine how raising an exception on invalid times only if a non-default sentinel value is given for a flag that is _new in PEP 495_ could possibly break 4000 lines of existing datetime tests.

Carl

From ischwabacher at wisc.edu Tue Aug 25 23:07:00 2015
From: ischwabacher at wisc.edu (ISAAC J SCHWABACHER)
Date: Tue, 25 Aug 2015 21:07:00 +0000
Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking
In-Reply-To:
References: <55DCC3CE.7070200@oddbird.net> <55DCCC55.2040908@oddbird.net> <55DCD068.9020604@oddbird.net> <55DCD135.8090801@oddbird.net>
Message-ID:

[Alexander Belopolsky]
> Whatever your requirements are, the difference between "strict checking" and PEP 495 code will be the same as between
>
> dt = get_naive_time()
> try:
>     dt = dt.astimezone()
> except AmbiguousTimeError:
>     # do something
> except MissingTimeError:
>     # do something else
> else:
>     # use dt

It's obvious to me what this code means.

> and
>
> dt = get_naive_time()
> dt0 = dt.replace(fold=0).astimezone()
> dt1 = dt.replace(fold=1).astimezone()
> if dt1 > dt0:
>     # do something
> elif dt1 < dt0:
>     # do something else
> else:
>     # use dt0 which is the same as dt1

The only thing that's obvious to me about this code is that it's hacking around something.

ijs

From alexander.belopolsky at gmail.com Tue Aug 25 23:22:45 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 25 Aug 2015 17:22:45 -0400
Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking
In-Reply-To: <55DCD84A.5060006@oddbird.net>
References: <55DCC3CE.7070200@oddbird.net> <55DCCC55.2040908@oddbird.net> <55DCD068.9020604@oddbird.net> <55DCD499.6020801@oddbird.net> <55DCD84A.5060006@oddbird.net>
Message-ID:

On Tue, Aug 25, 2015 at 5:04 PM, Carl Meyer wrote:

> I can't imagine how raising an exception on invalid times only if a
> non-default sentinel value is given for a flag that is _new in PEP 495_
> could possibly break 4000 lines of existing datetime tests.

OK, so datetime module itself will never set fold=-1.
In the list below, can you mark the methods that need to be patched to
check the fold attribute in your preferred design:

datetime.__add__
datetime.__eq__
datetime.__format__
datetime.__ge__
datetime.__gt__
datetime.__hash__
datetime.__le__
datetime.__lt__
datetime.__ne__
datetime.__new__
datetime.__radd__
datetime.__reduce__
datetime.__reduce_ex__
datetime.__repr__
datetime.__rsub__
datetime.__str__
datetime.astimezone
datetime.combine
datetime.ctime
datetime.date
datetime.dst
datetime.fromordinal
datetime.fromtimestamp
datetime.isocalendar
datetime.isoformat
datetime.isoweekday
datetime.now
datetime.replace
datetime.strftime
datetime.strptime
datetime.time
datetime.timestamp
datetime.timetuple
datetime.timetz
datetime.today
datetime.toordinal
datetime.tzname
datetime.utcfromtimestamp
datetime.utcnow
datetime.utcoffset
datetime.utctimetuple
datetime.weekday

Please ask Isaac and Stuart to do the same.  Once you agree on a list,
let's continue this discussion.

From carl at oddbird.net  Tue Aug 25 23:28:32 2015
From: carl at oddbird.net (Carl Meyer)
Date: Tue, 25 Aug 2015 15:28:32 -0600
Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking
In-Reply-To:
References: <55DCC3CE.7070200@oddbird.net> <55DCCC55.2040908@oddbird.net>
 <55DCD068.9020604@oddbird.net> <55DCD499.6020801@oddbird.net>
 <55DCD84A.5060006@oddbird.net>
Message-ID: <55DCDE00.5080700@oddbird.net>

On 08/25/2015 03:22 PM, Alexander Belopolsky wrote:
>
> On Tue, Aug 25, 2015 at 5:04 PM, Carl Meyer wrote:
>
>     I can't imagine how raising an exception on invalid times only if a
>     non-default sentinel value is given for a flag that is _new in PEP 495_
>     could possibly break 4000 lines of existing datetime tests.
>
>
> OK, so datetime module itself will never set fold=-1.  In the list
> below, can you mark the methods that need to be patched to check the
> fold attribute in your preferred design:
...
> Please ask Isaac and Stuart to do the same.  Once you agree on a list,
> let's continue this discussion.

My answer is "only in those same locations where the fold attribute
would otherwise be checked in order to resolve an ambiguity." That is, I
wouldn't introduce any new checks: only and exactly where PEP 495 would
otherwise make a guess based on the fold attribute should it raise an
exception if the fold attribute is set to a "don't guess" sentinel (for
which I'd prefer None to -1, since the latter invites confusion with
tm_isdst).

Carl

From alexander.belopolsky at gmail.com  Tue Aug 25 23:36:48 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 25 Aug 2015 17:36:48 -0400
Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking
In-Reply-To: <55DCDE00.5080700@oddbird.net>
References: <55DCC3CE.7070200@oddbird.net> <55DCCC55.2040908@oddbird.net>
 <55DCD068.9020604@oddbird.net> <55DCD499.6020801@oddbird.net>
 <55DCD84A.5060006@oddbird.net> <55DCDE00.5080700@oddbird.net>
Message-ID:

On Tue, Aug 25, 2015 at 5:28 PM, Carl Meyer wrote:
>
> My answer is "only in those same locations where the fold attribute
> would otherwise be checked in order to resolve an ambiguity."

This includes utcoffset() which is used in datetime.__eq__.  What should
__eq__ do if utcoffset() raises AmbiguousTimeError?  Unpatched, it will
propagate the exception resulting in for example x in [y, x, x, x]
raising an error whenever y happened to be fold=-1 ambiguous.  Is your
code prepared to handle AmbiguousTimeError whenever you search for a
date in a list?  Does it check for fold != -1 before adding a date to a
list?
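
[Illustration, not part of the original thread]
The propagation described above can be reproduced without touching the
datetime module.  This is only a sketch under stated assumptions:
StrictStamp and its `ambiguous` flag are hypothetical stand-ins for a
datetime carrying a "don't guess" sentinel, and AmbiguousTimeError is
defined locally because the datetime module has no such exception.

```python
class AmbiguousTimeError(ValueError):
    """Hypothetical exception, named after the one discussed in the thread."""


class StrictStamp:
    """Hypothetical stand-in for a datetime built with a 'don't guess' flag."""

    def __init__(self, label, ambiguous=False):
        self.label = label
        self.ambiguous = ambiguous

    def utcoffset(self):
        # A strict instance refuses to pick an offset for an ambiguous time.
        if self.ambiguous:
            raise AmbiguousTimeError(self.label)
        return 0

    def __eq__(self, other):
        if not isinstance(other, StrictStamp):
            return NotImplemented
        # Like an aware datetime comparison, equality consults utcoffset()
        # on both operands, so either one can raise.
        return (self.label, self.utcoffset()) == (other.label, other.utcoffset())


x = StrictStamp("x")
y = StrictStamp("y", ambiguous=True)

try:
    x in [y, x, x, x]          # the search has to compare x with y first
except AmbiguousTimeError as exc:
    print("membership test raised:", exc)
```

Putting x ahead of y in the list would hide the problem, because list
membership tries identity before equality and stops at the first match.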
URL: From carl at oddbird.net Tue Aug 25 23:44:54 2015 From: carl at oddbird.net (Carl Meyer) Date: Tue, 25 Aug 2015 15:44:54 -0600 Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking In-Reply-To: References: <55DCC3CE.7070200@oddbird.net> <55DCCC55.2040908@oddbird.net> <55DCD068.9020604@oddbird.net> <55DCD499.6020801@oddbird.net> <55DCD84A.5060006@oddbird.net> <55DCDE00.5080700@oddbird.net> Message-ID: <55DCE1D6.9020509@oddbird.net> On 08/25/2015 03:36 PM, Alexander Belopolsky wrote: > > On Tue, Aug 25, 2015 at 5:28 PM, Carl Meyer > wrote: >> >> My answer is "only in those same locations where the fold attribute >> would otherwise be checked in order to resolve an ambiguity." > > > This includes utcoffset() which is used in datetime.__eq__. What should > __eq__ do if utcoffset() raises AmbiguousTimeError? Unpatched, it will > propagate the exception resulting in for example x in [y, x, x, x] > raising an error whenever y happened to be fold=-1 ambiguous. Is your > code prepared to handle AmbiguousTimeError whenever you search for a > date in a list? Does it check for fold != -1 before adding a date to a > list? This is a good question. I can see two defensible choices: 1) Sure, go ahead and propagate, and document it. Anyone choosing to use `fold=None` is responsible to ensure their aware datetimes are valid (i.e. immediately on constructing/combining/localizing them) before doing anything else, or else be prepared to catch InvalidTimeError pretty much anywhere else. This would be entirely fine with me. 2) In the specific case of `__eq__`, catch it and only consider the two datetimes equal if they both raised the same invalidity exception. But I think while (2) is tenable for `__eq__`, it doesn't have such a neat resolution for inequality comparisons and probably other cases either, so (1) is probably best. Carl P.S. 
There's a very strong temptation to break out the Zen of Python here, as it rarely applies so directly: "In the face of ambiguity, refuse the temptation to guess." And the discussion here isn't even contemplating following the Zen by default, only giving users a convenient option to follow the Zen! ;-) -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From alexander.belopolsky at gmail.com Tue Aug 25 23:47:52 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Tue, 25 Aug 2015 17:47:52 -0400 Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking In-Reply-To: <55DCE1D6.9020509@oddbird.net> References: <55DCC3CE.7070200@oddbird.net> <55DCCC55.2040908@oddbird.net> <55DCD068.9020604@oddbird.net> <55DCD499.6020801@oddbird.net> <55DCD84A.5060006@oddbird.net> <55DCDE00.5080700@oddbird.net> <55DCE1D6.9020509@oddbird.net> Message-ID: On Tue, Aug 25, 2015 at 5:44 PM, Carl Meyer wrote: > On 08/25/2015 03:36 PM, Alexander Belopolsky wrote: > > > > On Tue, Aug 25, 2015 at 5:28 PM, Carl Meyer > > wrote: > >> > >> My answer is "only in those same locations where the fold attribute > >> would otherwise be checked in order to resolve an ambiguity." > > > > > > This includes utcoffset() which is used in datetime.__eq__. What should > > __eq__ do if utcoffset() raises AmbiguousTimeError? Unpatched, it will > > propagate the exception resulting in for example x in [y, x, x, x] > > raising an error whenever y happened to be fold=-1 ambiguous. Is your > > code prepared to handle AmbiguousTimeError whenever you search for a > > date in a list? Does it check for fold != -1 before adding a date to a > > list? > > This is a good question. I can see two defensible choices: Pick one and try to defend it. In the face of ambiguity ... -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From carl at oddbird.net Tue Aug 25 23:50:33 2015 From: carl at oddbird.net (Carl Meyer) Date: Tue, 25 Aug 2015 15:50:33 -0600 Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking In-Reply-To: References: <55DCC3CE.7070200@oddbird.net> <55DCCC55.2040908@oddbird.net> <55DCD068.9020604@oddbird.net> <55DCD499.6020801@oddbird.net> <55DCD84A.5060006@oddbird.net> <55DCDE00.5080700@oddbird.net> <55DCE1D6.9020509@oddbird.net> Message-ID: <55DCE329.5070904@oddbird.net> On 08/25/2015 03:47 PM, Alexander Belopolsky wrote: > > On Tue, Aug 25, 2015 at 5:44 PM, Carl Meyer > wrote: > > On 08/25/2015 03:36 PM, Alexander Belopolsky wrote: > > > > On Tue, Aug 25, 2015 at 5:28 PM, Carl Meyer > > >> wrote: > >> > >> My answer is "only in those same locations where the fold attribute > >> would otherwise be checked in order to resolve an ambiguity." > > > > > > This includes utcoffset() which is used in datetime.__eq__. What should > > __eq__ do if utcoffset() raises AmbiguousTimeError? Unpatched, it will > > propagate the exception resulting in for example x in [y, x, x, x] > > raising an error whenever y happened to be fold=-1 ambiguous. Is your > > code prepared to handle AmbiguousTimeError whenever you search for a > > date in a list? Does it check for fold != -1 before adding a date to a > > list? > > This is a good question. I can see two defensible choices: > > Pick one and try to defend it. In the face of ambiguity ... I already did that. "while (2) is tenable for `__eq__`, it doesn't have such a neat resolution for inequality comparisons and probably other cases either, so (1) is probably best." and earlier in (1), "This would be entirely fine with me." Carl -------------- next part -------------- A non-text attachment was scrubbed... 
From alexander.belopolsky at gmail.com  Wed Aug 26 00:37:03 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 25 Aug 2015 18:37:03 -0400
Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking
In-Reply-To: <55DCE329.5070904@oddbird.net>
References: <55DCC3CE.7070200@oddbird.net> <55DCCC55.2040908@oddbird.net>
 <55DCD068.9020604@oddbird.net> <55DCD499.6020801@oddbird.net>
 <55DCD84A.5060006@oddbird.net> <55DCDE00.5080700@oddbird.net>
 <55DCE1D6.9020509@oddbird.net> <55DCE329.5070904@oddbird.net>
Message-ID:

On Tue, Aug 25, 2015 at 5:50 PM, Carl Meyer wrote:
>
> >     This is a good question. I can see two defensible choices:
> >
> > Pick one and try to defend it.  In the face of ambiguity ...
>
> I already did that.
>
> "while (2) is tenable for `__eq__`, it doesn't have such a
> neat resolution for inequality comparisons and probably other cases
> either, so (1) is probably best."
>
> and earlier in (1), "This would be entirely fine with me."

Just wanted to be sure. :-)  So your proposal is:

[Carl Meyer]
1) Sure, go ahead and propagate, and document it. Anyone choosing to use
`fold=None` is responsible to ensure their aware datetimes are valid
(i.e. immediately on constructing/combining/localizing them) before
doing anything else, or else be prepared to catch InvalidTimeError
pretty much anywhere else. This would be entirely fine with me.
[/Carl Meyer]

I interpret this as allowing __new__, .replace(), combine(), etc. to
create ambiguous fold=None instances, but any operation with those -
comparison, arithmetic, set/dict insertion or retrieval, etc - may
raise AmbiguousTimeError.
I am not sure what I as a user of such datetimes am supposed to do to
"ensure [my] aware datetimes are valid," but your idea clearly differs
from that of Stuart:

[Stuart Bishop]
pytz users need to optionally have exceptions raised when they try to
construct an invalid or ambiguous datetime instance, directly via
__new__ or indirectly with something like dt.replace(). If these methods
are called with first=None, it will be passed through to dt.utcoffset()
and it may raise an exception. dt.utcoffset() will only ever raise an
exception when the user has explicitly requested construction of a
datetime with strict checking. It will never raise an exception in
normal operation, including arithmetic, and could not cause problems in
the situations you cite.
[/Stuart Bishop]

If you understand each other and can agree on the list of datetime
methods that need to be patched to support fold=None - please do so and
present a joint proposal.

Meanwhile, I would like everyone to realize that lossless conversion of
gaps and folds is a tiny corner case in the universe of uses that the
datetime module has.  The goal of PEP 495 is to allow such conversion
while provably not creating problems for the mainstream uses.

If we allow creation of fold=None instances, all programs that use the
datetime module will have to be analyzed for how they handle them.  In
large projects, you may not even know the person who wrote the code that
produced datetime instances that the function you write today will
receive tomorrow.  So your choices will be either to sprinkle your code
with

    assert dt.fold is not None

statements and your documentation with the warnings that fold=None
datetime instances are not supported, or be prepared to handle an
exception each time you touch a datetime instance.
From alexander.belopolsky at gmail.com  Wed Aug 26 01:16:01 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 25 Aug 2015 19:16:01 -0400
Subject: [Datetime-SIG] Strict Invalid Time Checking: an idea for another PEP
Message-ID:

Here is an outline of a "Strict Invalid Time Checking" that might work:
the values of datetime.fold attribute will be restricted to 0 and 1, but
the datetime constructor will accept None as a possible value of the
fold argument.  The datetime constructor that receives fold=None will
set self.fold both ways and call self.tzinfo.utcoffset(self) twice
before returning the constructed instance.  If the values returned by
the two utcoffset() calls match - an instance with self.fold=0 will be
returned, if not - they will be compared and an appropriate error
returned.

This design seems workable, but immediately raises a question: shouldn't
datetime constructor get the strict=False argument instead of encoding
it in the third value of fold?

And if we want to have datetime(..., strict=True), why not just have
strict_datetime(...) function in your toolkit or on PyPI?  Not every
8-line function needs to be in the standard library.

We can discuss this and other questions if someone decides to champion
a Strict Invalid Time Checking PEP after PEP 495 is in-place.

From carl at oddbird.net  Wed Aug 26 01:19:16 2015
From: carl at oddbird.net (Carl Meyer)
Date: Tue, 25 Aug 2015 17:19:16 -0600
Subject: [Datetime-SIG] Strict Invalid Time Checking: an idea for another PEP
In-Reply-To:
References:
Message-ID: <55DCF7F4.805@oddbird.net>

On 08/25/2015 05:16 PM, Alexander Belopolsky wrote:
> Here is an outline of a "Strict Invalid Time Checking" that might work:
> the values of datetime.fold attribute will be restricted to 0 and 1, but
> the datetime constructor will accept None as a possible value of the
> fold argument.
> The datetime constructor that receives fold=None will
> set self.fold both ways and call self.tzinfo.utcoffset(self) twice
> before returning the constructed instance.  If the values returned by
> the two utcoffset() calls match - an instance with self.fold=0 will be
> returned, if not - they will be compared and an appropriate error returned.
>
> This design seems workable, but immediately raises a question: shouldn't
> datetime constructor get the strict=False argument instead of encoding
> it in the third value of fold?
>
> And if we want to have datetime(..., strict=True), why not just have
> strict_datetime(...) function in your toolkit or on PyPI?  Not every
> 8-line function needs to be in the standard library.
>
> We can discuss this and other questions if someone decides to champion
> a Strict Invalid Time Checking PEP after PEP 495 is in-place.

Works for me. You've convinced me that it's a subtle enough problem to
deserve its own PEP. And since it would likely involve adding either a
new argument to some methods/constructors, or a new (invalid under PEP
495) value to the disambiguation flag, there's no sense in which it
needs to be done at the same time; PEP 495 won't restrict future options.

Carl

From 4kir4.1i at gmail.com  Wed Aug 26 01:19:30 2015
From: 4kir4.1i at gmail.com (Akira Li)
Date: Wed, 26 Aug 2015 02:19:30 +0300
Subject: [Datetime-SIG] how does PEP-495 help improve dateutil, pytz
 timezone packages?
In-Reply-To: (Alexander Belopolsky's message of "Tue, 25 Aug 2015 13:08:44 -0400")
References:
Message-ID: <87lhcz7y7x.fsf@gmail.com>

Alexander Belopolsky writes:

> I am changing the subject line because neither of the PEPs mentioned in the
> original subject propose any changes to the time module.

This list is about improving datetime module, in particular PEP-495.
I've changed the subject accordingly.

As long as datetime module uses time module, the corresponding time
module issues that cannot be worked-around *are* datetime module
issues. It seems blatantly obvious.

If you don't consider the tz_ database to be essential for writing the
code that works with timezones; you may stop reading now.

.. _tz: https://www.iana.org/time-zones/repository/tz-link.html

What is discussed here ?
------------------------

It seems there is a communication problem. Let's overcommunicate then :)

The time module issues are mentioned to explain why it is not reasonable
to expect datetime.now(timezone.utc).astimezone() to work in the general
case.

stdlib's astimezone() is mentioned to point out that it may fail while
pytz works in the exact same case.

I want to demonstrate that pytz works in cases where stdlib and dateutil
fail currently to point out that *PEP-495 should either provide more
support for the way pytz works or demonstrate how PEP-495 fixes design
issues in stdlib and dateutil that make it difficult to enable better
timezone support.*

Why PEP-495 -- Local Time Disambiguation should care about zoneinfo?
--------------------------------------------------------------------

History shows that the current datetime API is at least partially
responsible that the only working solution (pytz) has more complicated
API. *it works but it might have been simpler and less error-prone.*

The same could be said about stdlib, dateutil, and the timezone packages
that are built on top of them such as arrow, delorean. The difference is
that they work in less cases (fail more).

Therefore even if the explicit goal of PEP-495 is different from
PEP-431; PEP-495 should avoid making the life more difficult for
zoneinfo packages or even more: it should consider *how it can help
pytz, dateutil, or some other timezone package to provide a good
tzdata-related API.*

The last part is the reason I've mentioned cases where stdlib, dateutil
fail in this thread.
What are possible good timezone API examples?
---------------------------------------------

From the _minimalistic_ category: times_ Python package -- utc/posix
time internally, local time is used only for input or display (similar
to Unicode sandwich approach: Unicode internally, use bytes only if
necessary to communicate with the outside world). No implicit timezone
conversions. It is unfortunately no longer supported. It is implemented
on top of arrow_ which (last time I've checked) has the same issues as
dateutil.

From the _kitchen-sink_ category: Time4J_ Java package -- a few
composable primitives provide powerful API. Notable feature: no temporal
arithmetic or manipulations for ZonalDateTime_.

.. _times: https://github.com/nvie/times/
.. _arrow: http://arrow.readthedocs.org
.. _Time4J: https://github.com/MenoData/Time4J
.. _ZonalDateTime: http://www.time4j.net/tutorial/zdt.html

What are examples of timezone-related issues that PEP-495 could solve?
----------------------------------------------------------------------

- utc -> local timezone conversions in dateutil. I haven't looked at the
  source but Stuart Bishop_ says that the new flag may fix this and
  perhaps other issues caused by ambiguous times

- datetime constructor method might start working with pytz timezones.
  The general goal is to leave pytz localize() method only for those
  people who need an exception for ambiguous or non-existent times.

The important part is that PEP-495 should not make it even more
difficult to use the packages correctly. Ideally, PEP-495 should evolve
with the corresponding experimental implementations that adapt the new
flag.

.. _Bishop: https://mail.python.org/pipermail/datetime-sig/2015-August/000466.html

> On Tue, Aug 25, 2015 at 11:47 AM, Akira Li <4kir4.1i at gmail.com> wrote:
>>
>> Alexander Belopolsky writes:
>>
>> >> On Aug 25, 2015, at 7:44 AM, Akira Li <4kir4.1i at gmail.com> wrote:
>> >>
>> >> note: stdlib variant datetime.now(timezone.utc).astimezone() may fail if it
>> >> uses time.timezone, time.tzname internally [3,4,5] when tm_gmtoff
>> >> tm_zone are not available on a given platform.
>> >
>> > If this actually happens on any supported platform - please file a bug
>> > report.  What we do in this case is not as simplistic as you describe.
>>
>> Bug-driven development is probably not the best strategy for a datetime
>> library ;) Tests can't catch all bugs. I've found out that astimezone()
>> may fail by *reading* its source and trying to *understand* what it does.
>
>
> I agree, but once you've read the code and see any logical errors, you
> should be able to construct a test case demonstrating wrong behavior.

I did.

>>
>> Here's the part from datetime.py [1] that computes the local timezone if
>> tm_gmtoff or tm_zone are not available:
>>
>>     # Compute UTC offset and compare with the value implied
>>     # by tm_isdst.  If the values match, use the zone name
>>     # implied by tm_isdst.
>>     delta = local - datetime(*_time.gmtime(ts)[:6])
>>     dst = _time.daylight and localtm.tm_isdst > 0
>>     gmtoff = -(_time.altzone if dst else _time.timezone)
>>     if delta == timedelta(seconds=gmtoff):
>>         tz = timezone(delta, _time.tzname[dst])
>>     else:
>>         tz = timezone(delta)
>>
>> Here's its C equivalent [2].
>>
>> Python issues that I've linked in the previous message [3,4,5] demonstrate
>> that time.timezone and time.tzname may have wrong values and therefore
>> the result *tz* may have a wrong tzname.
>
>
> To summarize for those who will not follow the links: [3] Is a closed "No
> obvious and correct way to get the time zone offset" issue.
> It was superseded by which in turn was closed
> by implementing the argument-less .astimezone() method.  [4] and [5] are
> time module issues.

Look at the code example immediately above the text you are commenting
on. Look at _time.tzname, _time.timezone there. It is the code from
datetime.astimezone() method. If timezone, tzname may be wrong then
astimezone() may also fail. The example below demonstrates the failure.

The issues that I've linked demonstrate specific cases when timezone,
tzname are wrong. The status of the issues is irrelevant (timezone,
tzname behavior hasn't changed).

>>
>> Here's an example inspired by
>> "incorrect time.timezone value" Python issue [4]:
>>
>>   >>> from datetime import datetime, timezone
>>   >>> from email.utils import parsedate_to_datetime
>>   >>> import tzlocal # to get local timezone as pytz timezone
>>   >>> d = parsedate_to_datetime("Tue, 28 Oct 2013 14:27:54 +0000")
>>   >>> # expected (TZ=Europe/Moscow)
>>   ...
>>   >>> d.astimezone(tzlocal.get_localzone()).strftime('%Z%z')
>>   'MSK+0400'
>>   >>> # got
>>   ...
>>   >>> d.astimezone().strftime('%Z%z')
>>   'UTC+04:00+0400'
>>
>
> I don't understand why you keep presenting a mix of pytz, email.utils and
> something called "tzlocal" and then claim that the unexpected behavior
> indicates a problem in the datetime module?  It could as well be in any of
> the three other modules that you use or in the way you combine them.

*"something called "tzlocal""*

  >>> import tzlocal # to get local timezone as pytz timezone

really, neither the comment ^^^ nor the code example

  d.astimezone(tzlocal.get_localzone())

itself told you nothing :)

The purpose is to demonstrate that pytz works without relying on
tm_gmtoff, tm_zone attributes while at the same time astimezone() fails
here. Your own code below produces MSK+0400 that implies that you do
know that it is the correct answer even if it weren't obvious just by
looking at the result strings.
I don't understand how you could even suggest that MSK+0400 is wrong and
UTC+04:00+0400 is the correct behavior here.

Here's a distilled example:

  >>> from datetime import datetime, timezone
  >>> datetime(2013, 10, 28, tzinfo=timezone.utc).astimezone().strftime('%Z%z')

If you *disable tm_gmtoff attribute* then it produces UTC+04:00+0400.
That differs from the expected output MSK+0400, like the same code
demonstrates if you enable the attribute. Notice (direct quote): "if
tm_gmtoff or tm_zone are not available" above.

> If you want to parse the string "Tue, 28 Oct 2013 14:27:54 +0000" and
> convert it to Moscow time, here is how you do it using the datetime module:
>
> >>> import os; os.environ['TZ'] = 'Europe/Moscow'
> >>> from datetime import datetime
> >>> d = datetime.strptime("Tue, 28 Oct 2013 14:27:54 +0000", "%a, %d %b %Y %H:%M:%S %z")
> >>> d.astimezone().strftime("%F %T %Z%z")
> '2013-10-28 18:27:54 MSK+0400'
>
> Does this code behave differently on your system?  If it does - please file
> a bug report.

My mistake, I should have made it even more clear that the example
illustrates the results of the code from stdlib immediately above it and
therefore the tm_gmtoff, tm_zone access is disabled.

Try your code making sure that tm_zone is not used.

>>
>> 'UTC+04:00' instead of 'MSK' is not a major issue. I don't consider it a
>> bug because without access to the tz database stdlib can't do much
>> better, there always be cases when it breaks.
>
>
> It is quite possible that such cases exist, but you have not
> demonstrated one.
>
>>
>> I just use pytz instead which does provide access to the tz database.
>
>
> This will always be your option as it is your option to use just the
> datetime module.  In both cases you can write correct code if you follow
> the reference manual or buggy code if you don't.  An almost sure way to
> write buggy code is to use one library manual to write code using another.

No.
You can't write the correct code that works with timezones using only
stdlib e.g., %Z support http://bugs.python.org/issue22377

  >>> from datetime import datetime
  >>> datetime.strptime('2016-12-04 08:00:00 EST', '%Y-%m-%d %H:%M:%S %Z')
  Traceback (most recent call last):
  ...
  ValueError: ...

dateutil allows to disambiguate the timezone abbreviation and returns an
aware datetime in this case:

  >>> import dateutil.parser
  >>> dateutil.parser.parse('2016-12-04 08:00:00 EST', tzinfos={'EST':-18000})
  datetime.datetime(2016, 12, 4, 8, 0, tzinfo=tzoffset('EST', -18000))

---

'UTC+04:00+0400' is not a bug like it is not a bug that a 8-bit Windows
codepage can't encode all Unicode characters -- it can't and you don't
expect it -- you just use the encoding such as utf-8 that does support
the whole Unicode range. I don't expect datetime code that uses
time.timezone, time.tzname internally (as the excerpt from datetime.py
above demonstrates) to do timezone conversions without issues.

Again, the purpose of the example is to demonstrate the *fundamental*
deficiency in datetime module that can't be fixed without access to the
tz database (tm_gmtoff is a way to get such access for a local
timezone).
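
[Illustration, not part of the original thread]
The fallback quoted earlier from datetime.py can be tried in isolation
to see what it yields on a given platform.  A minimal sketch, assuming
only the public time module: fallback_local_timezone is a hypothetical
name, and the body follows the quoted excerpt rather than any current
CPython source.

```python
import time
from datetime import datetime, timedelta, timezone


def fallback_local_timezone(ts):
    """Build a fixed-offset timezone for ts without tm_gmtoff or tm_zone.

    Hypothetical helper mirroring the quoted excerpt from datetime.py.
    """
    localtm = time.localtime(ts)
    local = datetime(*localtm[:6])
    # Offset actually in effect at ts, measured from the broken-down times.
    delta = local - datetime(*time.gmtime(ts)[:6])
    dst = bool(time.daylight and localtm.tm_isdst > 0)
    # Offset implied by the process-wide time.timezone / time.altzone.
    gmtoff = -(time.altzone if dst else time.timezone)
    if delta == timedelta(seconds=gmtoff):
        # The globals agree with reality, so trust time.tzname as well.
        return timezone(delta, time.tzname[dst])
    # The globals disagree: keep the offset, give up on the zone name.
    return timezone(delta)


ts = datetime(2013, 10, 28, 14, 27, 54, tzinfo=timezone.utc).timestamp()
tz = fallback_local_timezone(ts)
print(tz.tzname(None), tz.utcoffset(None))
```

When the second branch is taken the name is synthesized from the offset,
which is where a string of the 'UTC+04:00' kind comes from.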
>>
>>
>> [1] https://github.com/python/cpython/blob/fced0e12fc510e4a6158628695774ccfd02395d3/Lib/datetime.py#L1513-L1522
>> [2] https://github.com/python/cpython/blob/fced0e12fc510e4a6158628695774ccfd02395d3/Modules/_datetimemodule.c#L4721-L4735
>> [3] http://bugs.python.org/issue1647654
>> [4] http://bugs.python.org/issue22752
>> [5] http://bugs.python.org/issue22798

From 4kir4.1i at gmail.com  Wed Aug 26 01:24:03 2015
From: 4kir4.1i at gmail.com (Akira Li)
Date: Wed, 26 Aug 2015 02:24:03 +0300
Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking
In-Reply-To: (Alexander Belopolsky's message of "Tue, 25 Aug 2015 14:49:21 -0400")
References:
Message-ID: <87k2sj7y0c.fsf@gmail.com>

Alexander Belopolsky writes:

> On Tue, Aug 25, 2015 at 1:56 PM, Stuart Bishop wrote:
>
>> What would be even nicer is if users didn't have to use localize at all:
>>
>> >>> datetime.now(tz=pytz.timezone('US/Eastern'))
>>
>
> This is certainly one of the main goals of PEP 495.  Note that
> datetime.now() will never produce an invalid datetime and will always set
> the fold attribute correctly in the (two-)fold case if
> pytz.timezone('US/Eastern') follows PEP guidelines.  Therefore there is no
> need for the third fold value in this case.

datetime.now(pytz.timezone('US/Eastern')) already works [1]
It doesn't require the disambiguation flag for pytz timezones.

Though the flag might be useful for dateutil timezones:
datetime.now(dateutil.tz.tzlocal()) may fail [2]

[1] http://stackoverflow.com/questions/31886808/when-does-datetime-nowpytz-timezone-fail
[2] https://github.com/dateutil/dateutil/issues/57

From alexander.belopolsky at gmail.com  Wed Aug 26 01:45:19 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 25 Aug 2015 19:45:19 -0400
Subject: [Datetime-SIG] how does PEP-495 help improve dateutil, pytz
 timezone packages?
In-Reply-To: <87lhcz7y7x.fsf@gmail.com>
References: <87lhcz7y7x.fsf@gmail.com>
Message-ID:

On Tue, Aug 25, 2015 at 7:19 PM, Akira Li <4kir4.1i at gmail.com> wrote:

> Here's a distilled example:
>
>   >>> from datetime import datetime, timezone
>   >>> datetime(2013, 10, 28, tzinfo=timezone.utc).astimezone().strftime('%Z%z')
>
> If you *disable tm_gmtoff attribute* then it produces UTC+04:00+0400.
> That differs from the expected output MSK+0400, like the same code
> demonstrates if you enable the attribute. Notice (direct quote): "if
> tm_gmtoff or tm_zone are not available" above.
>

Of course!  That's why we exposed tm_gmtoff attribute in
time.struct_time on *all platforms* IIRC.  It's been a long time, but I
recall that we went to some great lengths to emulate tm_gmtoff by
comparing the results of localtime calls to those of gmtime.  Could it
be that we missed some corner cases?  Sure.  But your "if tm_gmtoff or
tm_zone are not available" sounds like complaining that after

>>> del datetime.timezone

the datetime module does not support even the UTC timezone!

From tim.peters at gmail.com  Wed Aug 26 02:04:26 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Tue, 25 Aug 2015 19:04:26 -0500
Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking
In-Reply-To: <55DCDE00.5080700@oddbird.net>
References: <55DCC3CE.7070200@oddbird.net> <55DCCC55.2040908@oddbird.net>
 <55DCD068.9020604@oddbird.net> <55DCD499.6020801@oddbird.net>
 <55DCD84A.5060006@oddbird.net> <55DCDE00.5080700@oddbird.net>
Message-ID:

[Alexander]
>> OK, so datetime module itself will never set fold=-1.  In the list
>> below, can you mark the methods that need to be patched to check the
>> fold attribute in your preferred design:
> ...
>> Please ask Isaac and Stuart to do the same.  Once you agree on a list,
>> let's continue this discussion.

[Carl]
> My answer is "only in those same locations where the fold attribute
> would otherwise be checked in order to resolve an ambiguity." That is, I
> wouldn't introduce any new checks: only and exactly where PEP 495 would
> otherwise make a guess based on the fold attribute should it raise an
> exception if the fold attribute is set to a "don't guess" sentinel (for
> which I'd prefer None to -1, since the latter invites confusion with
> tm_isdst).

[and more iterations in later messages]

This is going nowhere fast ;-)

Best I can tell, time errors in pytz are pretty darn predictable,
typically occurring (if at all) immediately upon explicitly calling
`localize()` with `is_dst=None`.  I'm sure there are other ways, but
that seems to be the gist of it.  And pytz users like it.

But somehow, in trying to translate to a 495-workalike, pytz's
explicit, straightforward scheme has turned into spraying the logic
all over datetime internals, to possibly raise time errors from many
contexts by magic, and even in the absence of the user explicitly
asking for the check _at the time_ a time error may be raised.

I think it's important to note that nobody has any experience with
that.  So Alexander is right to be concerned about bloat, slowdowns,
vagueness, and/or unintended consequences in a far more magical scheme
nobody has actually used.

So far as I'm concerned, this whole issue is off the table for PEP
495.  It's trivial to write a "check for folds or gaps" function
_given_ 495-compliant tzinfos, although only trivial for the highly
informed ;-)  Big deal.  Show a function raising pytz-like exceptions
once in the docs, and nobody else has to figure it out (with the bonus
that they can alter it to do whatever they really want in oddball
cases).

If people are happy enough today explicitly calling a function in
pytz, they should be happy enough tomorrow explicitly calling a
function in Python.  Life won't change for pytz users, and
plain-Python users will have new possibilities.
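
[Illustration, not part of the original thread]
A "check for folds or gaps" function of the kind mentioned above might
look like the sketch below.  It assumes a tzinfo whose utcoffset()
honours the fold attribute as PEP 495 describes, and it needs a Python
that has datetime.fold.  ToyEastern2015 and check_local_time are
hypothetical names; the toy zone hard-codes the 2015 US Eastern
transitions so the example carries no time zone data dependency.  The
exception names are borrowed from the discussion and defined locally.

```python
from datetime import datetime, timedelta, tzinfo

HOUR = timedelta(hours=1)


class InvalidTimeError(ValueError):
    """Hypothetical base class for the two errors below."""


class AmbiguousTimeError(InvalidTimeError):
    """The wall time occurs twice (a fold)."""


class MissingTimeError(InvalidTimeError):
    """The wall time never occurs (a gap)."""


class ToyEastern2015(tzinfo):
    """Hypothetical fold-aware tzinfo: US Eastern rules for 2015 only."""

    STD, DST = timedelta(hours=-5), timedelta(hours=-4)
    GAP_START = datetime(2015, 3, 8, 2)     # 02:00-03:00 is skipped
    FOLD_START = datetime(2015, 11, 1, 1)   # 01:00-02:00 happens twice

    def utcoffset(self, dt):
        wall = dt.replace(tzinfo=None)
        if self.GAP_START <= wall < self.GAP_START + HOUR:
            # In a gap, fold=0 keeps the old offset and fold=1 uses the new.
            return self.DST if dt.fold else self.STD
        if self.FOLD_START <= wall < self.FOLD_START + HOUR:
            # In a fold, fold=0 is the first pass and fold=1 the second.
            return self.STD if dt.fold else self.DST
        if self.GAP_START + HOUR <= wall < self.FOLD_START:
            return self.DST
        return self.STD

    def dst(self, dt):
        return self.utcoffset(dt) - self.STD

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


def check_local_time(naive, tz):
    """Attach tz to naive, raising if the wall time is a fold or a gap."""
    off0 = naive.replace(tzinfo=tz, fold=0).utcoffset()
    off1 = naive.replace(tzinfo=tz, fold=1).utcoffset()
    if off0 > off1:
        raise AmbiguousTimeError(naive)
    if off0 < off1:
        raise MissingTimeError(naive)
    return naive.replace(tzinfo=tz, fold=0)


tz = ToyEastern2015()
for wall in (datetime(2015, 7, 1, 12),
             datetime(2015, 11, 1, 1, 30),
             datetime(2015, 3, 8, 2, 30)):
    try:
        print(wall, "->", check_local_time(wall, tz).utcoffset())
    except InvalidTimeError as exc:
        print(wall, "->", type(exc).__name__)
```

The check is explicit and happens once, at the point where the caller
asks for it, which is the pytz-style usage described above.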
If it's truly the case that this all _could_ be built in to Python in a reasonably usable, wholly backward compatible, "magical" way, fine - then another PEP can be proposed to add it. While related, it's a different feature. It's unreasonable to delay PEP 495 for whistles it doesn't _need_ to meet its goals (which are overwhelmingly about supplying a fold/gap flag in a backward-compatible way, to fix the _fundamental_ current impossibility of doing timezone conversions correctly in all cases - anything beyond that is at best just "maybe nice to have" _in the context of_ PEP 495). So write another PEP, write a prototype, and let people kick the tires to see how it actually works out in practice. From 4kir4.1i at gmail.com Wed Aug 26 03:20:34 2015 From: 4kir4.1i at gmail.com (Akira Li) Date: Wed, 26 Aug 2015 04:20:34 +0300 Subject: [Datetime-SIG] how does PEP-495 help improve dateutil, pytz timezone packages? In-Reply-To: (Alexander Belopolsky's message of "Tue, 25 Aug 2015 19:45:19 -0400") References: <87lhcz7y7x.fsf@gmail.com> Message-ID: <87fv36976l.fsf@gmail.com> Alexander Belopolsky writes: > On Tue, Aug 25, 2015 at 7:19 PM, Akira Li <4kir4.1i at gmail.com> wrote: > >> Here's a distilled example: >> >> >>> from datetime import datetime, timezone >> >>> datetime(2013, 10, 28, >> tzinfo=timezone.utc).astimezone().strftime('%Z%z') >> >> If you *disable tm_gmtoff attribute* then it produces UTC+04:00+0400. >> That differs from the expected output MSK+0400, like the same code >> demonstrates if you enable the attribute. Notice (direct quote): "if >> tm_gmtoff or tm_zone are not available" above. >> > > Of course! That's why we exposed tm_gmtoff attribute in time.time_struct > on *all platfoms* IIRC. It's been a long time, by I recall that we went to > some great lengths to emulate tm_gmtoff by comparing the results of > localtime calls to those of gmtime. Could it be that we missed some > corner cases? Sure. 
But your "if tm_gmtoff or tm_zone are not available"
> sounds like complaining that after
>
>>>> del datetime.timezone
>
> the datetime module does not support even the UTC timezone!

Whether or not tm_gmtoff is available depends on the C library. The time module documentation says [1]:

  .. versionchanged:: 3.3
     :attr:`tm_gmtoff` and :attr:`tm_zone` attributes are available on platforms with C library supporting the corresponding fields in ``struct tm``.

i.e., tm_gmtoff, tm_zone may be absent on some platforms. How is it similar to *del datetime.timezone* if whether or not the attributes are available depends on platform?

I don't see how any system without a historical timezone database may support the attributes and, as far as I know, the C library that is used by python.exe does not use the tz database on Windows.

[1] https://docs.python.org/3.4/library/time.html#time.struct_time

To make sure that we are talking about the same thing, here's how I've temporarily disabled the attributes to perform the test. It *emulates* platforms with one or both attributes missing:

  diff --git a/Lib/datetime.py b/Lib/datetime.py
  --- a/Lib/datetime.py
  +++ b/Lib/datetime.py
  @@ -1507,6 +1507,7 @@
           local = datetime(*localtm[:6])
           try:
               # Extract TZ data if available
  +            raise AttributeError
               gmtoff = localtm.tm_gmtoff
               zone = localtm.tm_zone
           except AttributeError:
  @@ -2129,7 +2130,7 @@
   # pretty bizarre, and a tzinfo subclass can override fromutc() if it is.

   try:
  -    from _datetime import *
  +    from __datetime import *
   except ImportError:
       pass
   else:

1st line forces AttributeError exception handler to execute. 2nd line disables C code that looks identical to the pure Python code.

And here's the code from the except AttributeError clause above that I've shown in earlier messages [2] (C code does the same [3]):

    # Compute UTC offset and compare with the value implied
    # by tm_isdst. If the values match, use the zone name
    # implied by tm_isdst.
    delta = local - datetime(*_time.gmtime(ts)[:6])
    dst = _time.daylight and localtm.tm_isdst > 0
    gmtoff = -(_time.altzone if dst else _time.timezone)
    if delta == timedelta(seconds=gmtoff):
        tz = timezone(delta, _time.tzname[dst])
    else:
        tz = timezone(delta)

note: as the example with

    datetime(2013, 10, 28, tzinfo=timezone.utc).astimezone().strftime('%Z%z')

in TZ=Europe/Moscow demonstrates, this code [4] does not find the correct tz.tzname() and therefore it is not an appropriate replacement for *tm_zone*. The code [4] can't be fixed without data from the tz database.

[2] https://mail.python.org/pipermail/datetime-sig/2015-August/000471.html
[3] https://github.com/python/cpython/blob/fced0e12fc510e4a6158628695774ccfd02395d3/Modules/_datetimemodule.c#L4721-L4735
[4] https://github.com/python/cpython/blob/fced0e12fc510e4a6158628695774ccfd02395d3/Lib/datetime.py#L1513-L1522

From alexander.belopolsky at gmail.com Wed Aug 26 03:55:27 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 25 Aug 2015 21:55:27 -0400
Subject: [Datetime-SIG] how does PEP-495 help improve dateutil, pytz timezone packages?
In-Reply-To: <87fv36976l.fsf@gmail.com>
References: <87lhcz7y7x.fsf@gmail.com> <87fv36976l.fsf@gmail.com>
Message-ID:

> On Aug 25, 2015, at 9:20 PM, Akira Li <4kir4.1i at gmail.com> wrote:
>
> The code [4] can't be fixed without data from the tz database.

.. or a working tm_zone/tm_gmtoff extension. Ok, I'll give you that, even though I thought we had some good enough work-around at least for the offset part.

What does this all have to do with PEP 495? If you are just saying that it does not solve all problems related to dealing with time zones in Python - I will be first to agree with you. It solves only one problem: it enables tzinfo providers to achieve lossless conversion between time zones with varying utcoffset. That's all.

If you are saying that PEP 495 is not needed if all your time zones have fixed utcoffset - again no disagreement here.
If you say that lossless conversion between varying utcoffset timezones should not be supported at all - I think you will find yourself in a small minority. From 4kir4.1i at gmail.com Wed Aug 26 05:35:12 2015 From: 4kir4.1i at gmail.com (Akira Li) Date: Wed, 26 Aug 2015 06:35:12 +0300 Subject: [Datetime-SIG] how does PEP-495 help improve dateutil, pytz timezone packages? In-Reply-To: (Alexander Belopolsky's message of "Tue, 25 Aug 2015 21:55:27 -0400") References: <87lhcz7y7x.fsf@gmail.com> <87fv36976l.fsf@gmail.com> Message-ID: <87bndu90y7.fsf@gmail.com> Alexander Belopolsky writes: >> On Aug 25, 2015, at 9:20 PM, Akira Li <4kir4.1i at gmail.com> wrote: >> >> The code [4] can't be fixed without data from the tz database. > > .. or a working tm_zone/tm_gmtoff extension. Ok, I'll give you that > even thought I thought we had some good enough work-around at least > for the offset part. > > What does this all have to do with PEP 495? If you are just saying > that it does not solve all problems related to dealing with time zones > in Python - I will be first to agree with you. It solves only one > problem: it enables tzinfo providers to achieve lossless conversion > between time zones with varying utcoffset. That's all. I've described _in detail_ [1] what the connection between the datetime.astimezone() failure and PEP-495 is. Here are some takeaways: - History shows that the current datetime API is at least partially responsible [for] that the only working solution (pytz) has more complicated [then necessary] API. *It works but it might have been simpler and less error-prone* - The important part is that PEP-495 should not make it even more difficult to use the packages [pytz, dateutil] correctly - Ideally, PEP-495 should evolve with the corresponding experimental implementations [of the packages] that adapt the new flag. > If you are saying that PEP 495 is not needed if all your time zones > have fixed utcoffset - again no disagreement here. 
> > If you say that lossless conversion between varying utcoffset > timezones should not be supported at all - I think you will find > yourself in a small minority. I don't understand what you mean but If we are choosing minorities then I'm in this one [2]: This database (often called zoneinfo or tz) is used by several implementations, including the GNU C Library (used in GNU/Linux), Android, Firefox OS, FreeBSD, NetBSD, OpenBSD, Cygwin, DJGPP, MINIX, webOS, AIX, BlackBerry 10, iOS, Microsoft Windows, OpenVMS, Oracle Database, Oracle Solaris, and OS X. Do you mean "varying utcoffset timezones" like dateutil's tzinfo object? I've explicitly mentioned dateutil in "What are examples of timezone-related issues that PEP-495 could solve?" section [1]. It is one of those cases where it is very easy to support 98.9%. The difference and the vast majority of the work is in 1.1%. Until the PEP-495 disambiguation flag is fully integrated in pytz, dateutil; it is hard to tell whether it is worth it or it is even harmful in the end. It seems you use timezone to mean a tzinfo instance. When we are discussing the implementation of packages that provide access to the tz database, let's use the term _time zone_ as it is understood there [3,5]: Within the tz database, a time zone is any national region where local clocks have all agreed since 1970 The typical timezone id is AREA/LOCATION such as America/New_York. It is worth keeping an eye on the necessary enhancements mentioned by Tim Peters [6] and how they may interact with PEP-495. 
[1] https://mail.python.org/pipermail/datetime-sig/2015-August/000506.html [2] https://www.iana.org/time-zones/repository/tz-link.html [3] https://en.wikipedia.org/wiki/Tz_database [4] https://github.com/python/cpython/blob/fced0e12fc510e4a6158628695774ccfd02395d3/Lib/datetime.py#L1513-L1522 [5] https://github.com/eggert/tz/blob/2fba5e1535bfada66d99a44c558c4d0dc3371135/Theory#L19-L23 [6] https://mail.python.org/pipermail/datetime-sig/2015-August/000411.html From tim.peters at gmail.com Wed Aug 26 10:16:42 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 26 Aug 2015 03:16:42 -0500 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: [Tim] >> ... >> Later you seem to say you'd prefer a 3-state flag instead, so not sure >> you really mean "boolean" here. [Stuart Bishop ] > I write Python and SQL for a living. Booleans are 3 state to me ;) Got it! Python is sooooooo behind the times :-) > In this case, I'm not fussed if the datetime instance has a 2 state or > 3 state flag. This is different to the various constructors which I > think need a 3 state flag in their arguments. True, False, None. As things seem to have progressed later, mapping pytz's explicit time checking into a more magical scheme sprayed all over the datetime internals is not straightforward, So, as I concluded elsewhere, that may or may not be done someday, but it's out of scope for PEP 495. I'm a fan of making progress ("now is better than never", where the latter was PEP 431's fate waiting for perfection on all counts). > ... > Grump. I always interpreted that documentation to mean that timezone > conversions where *my* problem as the author of the tzinfo > implementation. Conversions, yes; arithmetic, no. The tzinfo methods authors _needed_ to implement were .tzname(), .dst(), and .utcoffset(). Optionally, .fromutc(). None of those are about how arithmetic works. 
Your particular implementation needed to conflate the two somewhat, since you avoided "hybrid" tzinfo classes in favor of always using fixed-offset classes, which in turn meant arithmetic routinely left you with "a wrong" class for the then-current datetime value. Which was in turn repaired by needing to call .normalize() all over the place, to replace the now-possibly-wrong tzinfo. As I understand it. If so, it's fair to say that was not an anticipated kind of implementation ;-) > I thought it was a documented problem to be fixed if/when Python > ever provided more complex tzinfo implementations, and > one of the reasons it never did provide such implementations in the > first place. The inability to do conversion correctly in all cases was documented. It annoyed me a lot, because there was no expectation that it would _ever_ "be fixed". As I've often said, I considered that to be datetime's biggest flaw. But, "now is better than never" ;-) , and we ran out of time to do more - datetime had already met all its original design goals for some time, and our mutual employer at the time was understandably annoyed at continuing to pay for more development. > Classic behaviour as you describe it is a bug. Believe me, you won't get anywhere with that approach. - Classic arithmetic is the only kind that makes good sense in the "naive time" model, which _is_ datetime's model. - Timeline arithmetic is the only kind that makes good sense in the civil-time-based-on-POSIX-approximation-to-UTC model. Which is overwhelmingly the most common model among computer types (although fully understood by relatively few). - Timeline arithmetic _including_ accounting for leap seconds too is the only kind that makes good sense for how civil time (based on real-world UTC) has actually been defined for a few decades now. - It's all but certain that civil time will be redefined yet again someday, in which case only yet another kind of arithmetic will make good sense for that. 
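Editorial sketch contrasting the first two models in the list above: classic (naive-time) arithmetic versus timeline arithmetic on the same aware datetime. `ToyEastern2004` is a hypothetical tzinfo invented for this illustration; it hard-codes the 2004 US Eastern rules and deliberately ignores fold/gap subtleties, since neither endpoint here is a problem time. Leap seconds, the third model, are not representable in datetime at all.

```python
from datetime import datetime, timedelta, timezone, tzinfo

HOUR = timedelta(hours=1)


class ToyEastern2004(tzinfo):
    """Hypothetical tzinfo: US Eastern rules for 2004, ignoring fold/gap subtleties."""

    DST_START = datetime(2004, 4, 4, 2)
    DST_END = datetime(2004, 10, 31, 2)

    def dst(self, dt):
        wall = dt.replace(tzinfo=None)
        return HOUR if self.DST_START <= wall < self.DST_END else timedelta(0)

    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


eastern = ToyEastern2004()
start = datetime(2004, 10, 30, 15, tzinfo=eastern)  # the day before DST ends

# Classic (naive-time) arithmetic: the wall clock moves by exactly one day.
classic = start + timedelta(days=1)

# Timeline arithmetic: move by 24 elapsed hours, measured in UTC.
timeline = (start.astimezone(timezone.utc) + timedelta(days=1)).astimezone(eastern)

# Elapsed time between start and the classic result, measured in UTC.
elapsed = classic.astimezone(timezone.utc) - start.astimezone(timezone.utc)

print("classic: ", classic)
print("timeline:", timeline)
print("elapsed under classic arithmetic:", elapsed)
```

Subtracting two aware datetimes that share the same tzinfo object also uses classic rules, so `classic - start` is exactly one day even though 25 hours elapse on the UTC timeline.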
So "bug" or "feature" depends on which model you have in mind. Absolute statements make no sense. Each kind of arithmetic is "a feature" for the model it intends to serve, and "a bug" for some purposes with respect to all other models. You can legitimately complain that you hate the naive time model, but you can't complain that Python's datetime arithmetic doesn't match datetime's model. > It sounds ok when you state it as 'add one day to today and you get > the same time tomorrow'. That's always rigorously so in the naive time model, and regardless of whether you're talking about 1 day, 24 hours, 1440 minutes, ... > It does not sound ok when you state it as 'add one second to now and > you will normally get now + 1 second, but sometimes you will get an > instant further in the future, and sometimes you will get an instant > in the past'. Then you have a _different_ model in mind, and you need a different arithmetic for that. Now picture, say, a scientist insisting that the arithmetic _you_ want is WRONG ;-) because it sometimes tells them that, e.g, two moments in time are 1 second apart when in _reality_ they were exactly 2 SI seconds apart (due to a leap second inserted between). The two of you simply have different models in mind. Neither is right, neither is wrong, both want the arithmetic appropriate for the model they favor, and each will be sorely disappointed if they use an arithmetic appropriate for the other's model. That said, I would have preferred it if Python's datetime had used classic arithmetic only for naive datetimes. I feared it might be endlessly confusing if an aware datetime used classic arithmetic too. I'm not sure about "endlessly" now, but it has come up more than once ;-) Far too late to change now, though. > I dispute that this is default behaviour that can't be changed. The default arithmetic can't be changed. That was settled long ago - there wasn't ever the slightest possibility of changing the default arithmetic. 
So, while anyone is free to dispute it, there's not much point to that ;-) If someone wants different _default_ arithmetic, they'll need to write a new datetime-ish module. > The different arithmetic only matters when you have a dst-aware aware > datetime in play, and Python has never provided any apart perhaps from > your original reference implementation (which stopped working in > 2006). You're talking about the toy classes in the datetime docs? Looks like they've been updated all along to match changes in US daylight rules, although I would have kept only _the_ most recent rules as time went by. The original docs only intended to show how a tzinfo might be implemented, picking the then-current US daylight rules. Not really "a reference", a pedagogical example, and simplified to avoid burying the essential concepts under a mountain of arbitrary details. > pytz, however, has always provided timeline arithmetic. Yes, and it's a marvel! I never expected "the problem" could be wholly solved without changing Python internals. It's a mighty hack :-) > I believe this is the most widely deployed way of obtaining dst-aware > datetime instances, Me too. > and this is the most widely expected behaviour. Eh - there are many ways to get timeline arithmetic that "almost always" work. The angst over the tiny number of fold/gap cases seems way overblown to me, but so it goes. After PEP 495, it will be easy to get always-correct timeline arithmetic for anyone who wants it, provided they can get 495-conforming tzinfo objects (more below). > If you use pytz tzinfo instances, adding 1 second always adds one second But only in the _model_ you have in mind: real-life clocks showing real-life civil time suffer from leap seconds too. You can laugh that off in _your_ apps (and I can too ;-) ), but for other apps it's dead serious. > and adding 1 day always adds 24 hours. That's also true of classic arithmetic. The meanings of "day" and "24 hours" also depend on the model in use. 
> While calendaring style arithmetic is useful and a valid use case, In naive time, the distinction you want to make here doesn't really exist: timedelta supports keyword arguments for all the durations that have the same meanings as durations and as "calendar periods" _in_ naive time. > it is useless if the only relative type is the day. All common units <= a week and >= a microsecond are supported by timedelta, and all work perfectly fine in the naive time model. The extent to which they work for other purposes varies by model and purpose. > You also need months and years and periodic things like 'first > sunday every month'. This is too complex to inflict its API on > people by default. Agreed! timedelta supplies only units for which there is no possible argument about behavior _in_ naive time, and left it at that. But do note that things like "first Sunday of the month" are quite easy to implement _building_ on those: you just find the 3rd Sunday of the month then subtract timedelta(weeks=2) ;-) > But pulling in dateutils relative time helpers could be nice. If there's a groundswell of demand for adding "calendar operations" to Python, I'd be in favor of inviting Gustavo to fold dateutil's calendar operations into the core. > Do systems that rely on classic behavior actually exist? Of course. A more-or-less subtle example appears later. But we already mentioned dead-obvious uses: things like "same time tomorrow" and "same time two weeks from now" are common as mud, and classic arithmetic implements them fine. So do functions building on those primitives to implement more sophisticated calendar operations. 
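Editorial sketch of how a "first Sunday of the month" helper can be built on nothing but naive dates and timedelta, as suggested above. `nth_weekday` and `SUNDAY` are hypothetical names chosen for this illustration; dateutil offers richer tools for the same job.

```python
from datetime import date, timedelta

SUNDAY = 6  # date.weekday() numbering: Monday is 0, Sunday is 6


def nth_weekday(year, month, weekday, n=1):
    """Return the n-th (1-based) given weekday of a month, or raise ValueError."""
    first_of_month = date(year, month, 1)
    # Days from the 1st to the first occurrence of the requested weekday.
    days_ahead = (weekday - first_of_month.weekday()) % 7
    result = first_of_month + timedelta(days=days_ahead, weeks=n - 1)
    if result.month != month:
        raise ValueError("month has no such weekday occurrence")
    return result


print(nth_weekday(2015, 8, SUNDAY))     # first Sunday of August 2015
print(nth_weekday(2015, 8, SUNDAY, 3))  # third Sunday of August 2015
```

Because the arithmetic is purely naive, "subtract two weeks from the third Sunday" and "ask for the first Sunday" agree by construction.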
You might complain that naive time "same time tomorrow" makes no sense if someone is starting from 24 hours before what turns out to be a gap due to DST starting, but few in the real world schedule things at such times (e.g., DST transitions never occur during normal "business work hours"; if, e.g., some app postpones a business meeting a week, it's not credible that they'll ever end up in a gap by adding timedelta(weeks=1) - unless they're trying to account for leap seconds too, and the Earth's rotation speeds up "a lot", and "same time next week" ends up exactly in the missing second).

> It requires someone to have explicitly chosen to use daylight savings
> capable timezones, without using pytz, while at the same time relying on
> classic's surprising arithmetic. Maybe systems using dateutils without
> using dateutils' implementation of datetime arithmetic.

? dateutil doesn't implement arithmetic that I know of, apart from "relative deltas". It inherits Python's classic arithmetic for datetime - datetime, and datetime +/- timedelta, AFAICT.

> I believe that there are many more systems out there that are broken by this
> behaviour than are relying on this behaviour.

I don't know. I have little code of my own that needs timezones at all; in such code as I have, classic arithmetic works fine almost all the time, because things like "same time tomorrow" are overwhelmingly the only kinds of arithmetic I want. In the very few cases I give a rip about POSIX-approximation-to-real-world durations, I'm either using naive datetimes or tzinfo=timezone.utc, or I use one-liner functions like this one ("like this" because they're so easy to write when needed I never bothered to stick 'em in a module for reuse later):

    def dt_add(dt, td):
        # Classic "+" yields the UTC wall time plus td (tzinfo still attached);
        # fromutc() then converts that UTC reading back to local time.
        return dt.tzinfo.fromutc(dt + (td - dt.utcoffset()))

There you go: "timeline" datetime + timedelta arithmetic about as efficiently as possible in pure Python.
Note that _if_ the default changed to timeline arithmetic, this code would no longer work. The "+" there requires classic arithmetic to get the right result. Change the default, this code would break too. I find it hard to imagine I'm the only person in the world who has code similarly taking advantage of what Python actually does.

Example:

    from datetime import datetime, timedelta
    from pytz.reference import Eastern

    turkey_in = datetime(2004, 10, 30, 15, tzinfo=Eastern)
    turkey_out = dt_add(turkey_in, timedelta(days=1))
    print(turkey_in)
    print(turkey_out)

Output:

    2004-10-30 15:00:00-04:00
    2004-10-31 14:00:00-05:00

There my end-of-DST-party giant turkey needs to stay in the smoker for exactly 24 hours. That's "1 day" to me, because I think in naive time. The function effectively converts to UTC, adds 24 hours, then converts back, but more efficiently than bothering with .astimezone() in either direction. It correctly accounts for the fact that the end of DST "added an hour", so while I put the turkey in at 3pm Saturday I need to take it out at 2pm Sunday.

Note: my dt_add 1-liner may fail in cases starting or landing on a "problem time" (fold/gap). I've never cared, because DST transitions are intentionally scheduled to occur "wee hours on a weekend", i.e. when few people are both awake and sober enough _to_ care. But, after 495 tzinfos are available, the dt_add 1-liner will always work correctly. That this implementation of timeline arithmetic _can_ screw up now has nothing to do with its code; it's inherited from the inability of pure conversion to always work right now.

> I think this is a bug worth fixing rather than entrenching, before
> adding any dst aware tzinfo implementations to stdlib (including
> 'local').

datetime was released a dozen years ago. There's nothing it does that wasn't already thoroughly entrenched a decade ago.

> ...
> However... this also means the new flag on the datetime instances is
> largely irrelevant to pytz.
pytz' API will need to remain the same. My hope was that 495 alone would at least spare pytz's users from needing to do a `.normalize()` dance after `.astimezone()` anymore. Although I'm not clear on why it's needed even now. > Adding a timedelta to a datetime will give you a datetime in exactly > the same offset() and dst() as you started with (because pytz gives > you timeline arithmetic, where adding 24 hours actually adds 24 > hours), and you will need to fix it using the normalize method after > the fact. The is_dst bit is effectively stored on the tzinfo instance > currently in play, and having another copy on the datetime instance > unnecessary. Yes, 495 intends to repair conversion in all cases; it has no intent to do anything about arithmetic. A different PEP may address arithmetic later (well, PEP 500 already did, but it's been rejected). I won't be pushing for it, though. As above, after 495 solid timeline arithmetic is very easy to get via 1-line Python functions. Which I personally prefer to use: because I _want_ timeline arithmetic so rarely; using a named function instead makes it very clear that I'm doing something unusual (for me). Other people have different itches to scratch. But to be kinda brutal about it, _any_ catering to timeline arithmetic is misguided: it's enabling poor practices. People who need timeline arithmetic should really be working in UTC, where classic and timeline arithmetic are the same thing, and classic arithmetic runs much faster. My only use for it in a non-UTC datetime is calculating when to take the turkey out of the smoker one day per year ;-) > The new argument to the datetime constructors may be useful, if it > accepts tri-state. If the is_dst/first flag accepts True, False or > None, then pytz may be able to deprecate the localize method. If a > user calls localize(is_dst=None), AmbiguousTImeError and > NonExistantTimeError exceptions may be raised, but by default > exceptions are not raised. 
I would also need the opportunity to swap > in the correct fixed offset tzinfo instance for the given datetime. > (example below) > > Losing the localize method will be a huge win for pytz, as it is ugly > and causes great confusion and many identical bug reports. The other > problem, the normalize method, is less important - if you neglect to > call normalize you still get the correct instant, but it may be > reported in the incorrect timezone period (EST instead of EDT or vice > versa). There's a lot more about this in the recent "PEP-495 - Strict Invalid Time Checking" thread. > .... > I also need to continue to support timeline arithmetic. This requires > me not having a single tzinfo instance, but swapping in the correct > fixed offset tzinfo instance at the right time. Currently, this uses > the awful localize and normalize methods. Ideally, postPEP: > > >>> eastern = pytz.timezone('US/Eastern') > >>> dt = datetime(2004, 4, 3, 2, 0, 0, tzinfo=eastern) > >>> dt2 = dt + timedelta(days=1) > >>> eastern is dt.tzinfo > False > >>> dt.tzinfo is dt2.tzinfo > False Nothing in PEP 495 changes anything about arithmetic behavior. In particular, dt's tzinfo will be copied to dt2 by "+", just as it is now. _Anything_ else would break the very strict backward compatibility constraints Guido established for this PEP. > .... > If I can do this, there is no reason that pytz could not also support > 'classic' style, but I certainly wouldn't want to encourage its use as > my rant above might indicate ;) If I write documentation, it may > require some editing, localizing from en_AU to something a little more > polite. I expect pytz users who want classic arithmetic can get it already simply by not using pytz ;-) > ... > For pytz users, being able to write a function do tell if the data you > were given is broken is a step backwards. When constructing a datetime > instance with pytz, users have the choice of raising exceptions or > having pytz normalize the input. 
They are never given broken data (by > their definition), and there is no need to weed it out. Assuming they follow all "the rules", yes? For example, if they forget to use .localize(), etc, it seems like anything could happen. What if they use .replace()?: .combine()? Unpickle a datetime representing a missing time? Etc. I don't see that pytz has anything magical to check datetimes created by those. > ... > I think all functions that can create datetime instances will need the > new optional flag and the flag should be tri-state, defaulting to not > whine. See the "PEP-495 - Strict Invalid Time Checking" thread for more. There seems to be increasing "feature creep" here. Rewriting vast swaths of datetime internals to cater to this is at best impractical, especially compared to supplying a "check this datetime" function users who care can call when they care. Nevertheless, it's a suitable subject for a different PEP. I don't want to bog 495 down with it. If it had _stopped_ with asking for an optional check in the datetime constructor, it may have been implemented already ;-) > ... > The important bit here for pytz is that tzinfo.fromutc() may return a > datetime with a different tzinfo instance. Sorry, didn't follow that. Of course you can write your .fromutc() to return anything you want. > Also, to drop pytz' localize method I need something like > 'tzinfo.normalize(dt)', where I have the opportunity to replace > the tzinfo the user provided with the one with the correct > offset/dst info. If you're proposing a richer tzinfo interface, that's certainly out of scope for PEP 495. But I don't expect there's any possible way that PEP 495 on its own can replace all of pytz's uses for `normalize()` regardless. >>> - My argument in favour of 'is_dst' over 'first' is that this is what >>> we have in the data we are trying to load. You commonly have >>> a timestamp with a timezone abbreviation and/or offset. This can >>> easily be converted to an is_dst flag. 
>> You mean by using platform C library functions (albeit perhaps wrapped >> by Python)? I really missed an answer to that ;-) >>> To convert it to a 'first' flag, we need to first parse the datetime, >> I'm unclear on this. To get a datetime _at all_ the timestamp has to >> be converted to calendar notation (year, month, ...). Which is what >> I'm guessing "parse" means here. That much has to be done in any >> case. > My example is weak. I'm thinking about parsing a string like: > > 2004-10-31 01:15 EST-05:00 > > Even if you know this is US/Eastern and not Estonia, you still need to > know that for dates in October EDT is first and EST is not first, and > for dates in april EST is first and EDT is not first In April all times have first=True (or fold=0 in the latest spelling). first=False (fold=1) only occurs for the later times in a fold (during the second of the repeated hours at the end of EDT). > and you need to include a wide enough fuzz factor that future changes > to the DST rules won't break your parser. What does this have to do with datetime? So far you haven't mentioned any datetime - or pytz - operations. > But I guess a general purpose parser that cares would construct > instances 3 days before and a 3 days later and use whichever tzinfo > had the correct offset. Or just use a fixed offset tzinfo. Sorry, I'm still not grasping what "the problem" is here. In pytz, you would presumably create a datetime with an Olson-derived US/Eastern timezone. That would internally search for where 2004-10-31 06:15 (the UTC spelling of your example) lands in the list of transitions, and deduce more-or-less directly that the original time is the later of times in a fold. If you're _not_ using datetime or pytz at all, then you have no reason to _want_ to compute first/fold to begin with, right? > ... > I despair at the bug reports, questions and general confusion that > will occur if dst-aware tzinfo implementations are added to stdlib. 
At
> the moment, it is an obscure wart despite its age. It will become an
> in your face wart as soon as a tzlocal implementation is landed, and a
> wart people will be angry about because they won't realize it is there
> until their production system loses an hour's worth of orders because
> their Python app spat out an hour's worth of invalid timestamps right
> around Halloween sale time. But I'm drifting off into hyperbole.

But entertaining hyperbole, so it's appreciated :-)

After 495 is implemented, huge swaths of confusing docs can be moved into an appendix, covering all the rules and reasons for why ancient tzinfo implementations didn't allow for correct conversions in all cases. And that will make room for huge swaths of new confusing docs. But there's every reason to be optimistic: even someone as old and in-the-way as me doesn't find any of this particularly confusing ;-)

From alexander.belopolsky at gmail.com Wed Aug 26 16:01:35 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 26 Aug 2015 10:01:35 -0400
Subject: [Datetime-SIG] The name of the UTC timezone
Message-ID:

As a little diversion from the PEP wars, let me ask the group if we can fix this little wart: [1]

>>> from datetime import *
>>> t = datetime.now(timezone.utc)
>>> print(t)
2015-08-26 13:37:18.729831+00:00
>>> t.strftime("%F %T %Z%z")
'2015-08-26 13:37:18 UTC+00:00+0000'

The reason for such an odd result is that the "name" of timezone.utc was defined as

>>> t.tzname()
'UTC+00:00'

I don't think there was any deep reason for this choice.
We simply have one common rule for forming the names of all fixed offset timezones that don't have a name supplied in the constructor: >>> print(timezone(-5*HOUR)) UTC-05:00 For technical reasons, we cannot give timezone.utc a name, but I think we can change the rules for forming fixed offset timezone names so that zeros are not printed and instead of >>> print(timezone(0*HOUR)) UTC+00:00 we have >>> print(timezone(0*HOUR)) UTC [1]: http://bugs.python.org/issue22241 -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Wed Aug 26 18:30:42 2015 From: guido at python.org (Guido van Rossum) Date: Wed, 26 Aug 2015 09:30:42 -0700 Subject: [Datetime-SIG] The name of the UTC timezone In-Reply-To: References: Message-ID: Why is this needed? When would a real app use %Z%z? On Wed, Aug 26, 2015 at 7:01 AM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > As a little diversion from the PEP wars, let me ask the group if we can > fix this little wart: [1] > > >>> from datetime import * > >>> t = datetime.now(timezone.utc) > >>> print(t) > 2015-08-26 13:37:18.729831+00:00 > >>> t.strftime("%F %T %Z%z") > '2015-08-26 13:37:18 UTC+00:00+0000' > > The reason for such an odd result is that the "name" of timezone.utc was > defined as > > >>> t.tzname() > 'UTC+00:00' > > I don't think there was any deep reason for this choice. 
We simply have > one common rule for forming the names of all fixed offset timezones that > don't have a name supplied in the constructor: > > >>> print(timezone(-5*HOUR)) > UTC-05:00 > > For technical reasons, we cannot give timezone.utc a name, but I think we > can change the rules for forming fixed offset timezone names so that zeros > are not printed and instead of > > >>> print(timezone(0*HOUR)) > UTC+00:00 > > we have > > >>> print(timezone(0*HOUR)) > UTC > > > [1]: http://bugs.python.org/issue22241 > > _______________________________________________ > Datetime-SIG mailing list > Datetime-SIG at python.org > https://mail.python.org/mailman/listinfo/datetime-sig > The PSF Code of Conduct applies to this mailing list: > https://www.python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Wed Aug 26 18:38:40 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 26 Aug 2015 12:38:40 -0400 Subject: [Datetime-SIG] The name of the UTC timezone In-Reply-To: References: Message-ID: On Wed, Aug 26, 2015 at 12:30 PM, Guido van Rossum wrote: > Why is this needed? When would a real app use %Z%z? Given that a bug report [1] exists, someone was tripped by the current behavior. It is my understanding that "UTC" is one of the few timezone abbreviations that we recognize in strptime and the users expect strptime to be able to parse whatever strftime produces. [1]: http://bugs.python.org/issue22241 -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Wed Aug 26 19:05:10 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 26 Aug 2015 12:05:10 -0500 Subject: [Datetime-SIG] The name of the UTC timezone In-Reply-To: References: Message-ID: [ Guido] > Why is this needed? When would a real app use %Z%z? 
[in strftime] I've seen it often, and I'm not even a timezone wonk ;-) "The problem" it solves is for programmers who deal with lots of timezones. The name alone is ambiguous in some cases, and so is the offset alone. The offset alone is enough to convert that single instance to UTC, but that's not the only thing someone seeing the output may want to know (else nobody would ever display the name). From guido at python.org Wed Aug 26 19:19:04 2015 From: guido at python.org (Guido van Rossum) Date: Wed, 26 Aug 2015 10:19:04 -0700 Subject: [Datetime-SIG] The name of the UTC timezone In-Reply-To: References: Message-ID: OK, I guess we can change stdlib datetime.timezone.utc's str() to 'UTC'. Make sure to update both the C and the Python version, and the tests, and the docs. It's going to have to go into 3.6. On Wed, Aug 26, 2015 at 9:38 AM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > On Wed, Aug 26, 2015 at 12:30 PM, Guido van Rossum > wrote: > >> Why is this needed? When would a real app use %Z%z? > > > Given that a bug report [1] exists, someone was tripped by the current > behavior. It is my understanding that "UTC" is one of the few timezone > abbreviations that we recognize in strptime and the users expect strptime > to be able to parse whatever strftime produces. > > > [1]: http://bugs.python.org/issue22241 > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From 4kir4.1i at gmail.com Wed Aug 26 20:45:33 2015 From: 4kir4.1i at gmail.com (Akira Li) Date: Wed, 26 Aug 2015 21:45:33 +0300 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: (Tim Peters's message of "Wed, 26 Aug 2015 03:16:42 -0500") References: Message-ID: <87zj1d7usy.fsf@gmail.com> Tim Peters writes: > [Tim] >>> ... >>> Later you seem to say you'd prefer a 3-state flag instead, so not sure >>> you really mean "boolean" here. > > [Stuart Bishop ] >> I write Python and SQL for a living. 
Booleans are 3 state to me ;) > > Got it! Python is sooooooo behind the times :-) > > >> In this case, I'm not fussed if the datetime instance has a 2 state or >> 3 state flag. This is different to the various constructors which I >> think need a 3 state flag in their arguments. True, False, None. > > As things seem to have progressed later, mapping pytz's explicit time > checking into a more magical scheme sprayed all over the datetime > internals is not straightforward, So, as I concluded elsewhere, that > may or may not be done someday, but it's out of scope for PEP 495. > I'm a fan of making progress ("now is better than never", where the > latter was PEP 431's fate waiting for perfection on all counts). > Even if datetime's or replace()'s *first* parameter would be 3-state None|True|False; the internal flag can still be 2-state True|False. first=None could cause a tzinfo callback (it implies that tzinfo must not be None in this case) that sets *first* to True|False appropriately. ... >> Do systems that rely on classic behavior actually exist? > > Of course. A more-or-less subtle example appears later. But we > already mentioned dead-obvious uses: things like "same time tomorrow" > and "same time two weeks from now" are common as mud, and classic > arithmetic implements them fine. So do functions building on those > primitives to implement more sophisticated calendar operations. 
You > might complain that naive time "same time tomorrow" makes no sense if > someone is starting from 24 hours before what turns out to be a gap > due to DST starting, but few in the real world schedule things at such > times (e.g., DST transitions never occur during normal "business work > hours"; if, e.g., some app postpones a business meeting a week, it's > not credible that they'll ever end up in a gap by adding > timedelta(weeks=1) - unless they're trying to account for leap seconds > too, and the Earth's rotation speeds up "a lot", and "same time next > week" ends up exactly in the missing second). > There is a $5 wifi button that can be used to track baby data. Python helps at various stages: https://medium.com/@edwardbenson/how-i-hacked-amazon-s-5-wifi-button-to-track-baby-data-794214b0bdd8 Babies can poop at night and during DST transitions too. Sleep-deprived parents should be able to see the tracking data in local time in addition to UTC (doing timezone conversions is the computer's job). On the internet, people may cooperate while being in different time zones, i.e., even "business" software might have to work during DST transitions. MMORPGs are probably also not limited to a single time zone. Non-pytz timezones regularly make mistakes on the order of an hour. It is *three orders of magnitude larger* than a second. It is a different class of errors. The code that can't handle ~1s errors over a short period of time should use time.monotonic() anyway. ... >> It requires someone to have explicitly chosen to use daylight savings >> capable timezones, without using pytz, while at the same time relying on >> classic's surprising arithmetic. Maybe systems using dateutils without >> using dateutils' implementation of datetime arithmetic. > > ? dateutil doesn't implement arithmetic that I know of, apart from > "relative deltas". It inherits Python's classic arithmetic for > datetime - datetime, and datetime +/- timedelta, AFAICT.
dateutil doesn't work during DST transitions but PEP 495 might make it possible to fix it. As I understand it, outside of DST transitions, if dates are unique valid local times, dateutil uses "same time tomorrow": (d_with_dateutil_tzinfo + DAY == d.tzinfo.localize(d.replace(tzinfo=None) + DAY, is_dst=None)) while pytz uses "+24 hours": dt_add(d_with_dateutil_tzinfo, DAY) == d + DAY where dt_add() is defined below. The equality works but (d + DAY) may have a wrong tzinfo object if the arithmetic crosses DST boundaries (but it has correct timestamp/utc time anyway). d.tzinfo.normalize(d + DAY) should be used to get the correct tzinfo e.g. for displaying the result. Both types of operations should be supported. >... > def dt_add(dt, td): > return dt.tzinfo.fromutc(dt + (td - dt.utcoffset())) > >... > Note: my dt_add 1-liner may fail in cases starting or landing on a > "problem time" (fold/gap). I've never cared, because DST transitions > are intentionally scheduled to occur "wee hours on a weekend", i.e. > when few people are both awake and sober enough _to_ care. But, after > 495 tzinfos are available, the dt_add 1-liner will always work > correctly. That this implementation of timeline arithmetic _can_ > screw up now has nothing to do with its code, it's inherited from the > inability of pure conversion to always work right now. > Such choices should be made by an application developer, not a library/language developer. A library/language should avoid silent errors as much as possible. >> I think this is a bug worth fixing rather than entrenching, before >> adding any dst aware tzinfo implementations to stdlib (including >> 'local'). > > datetime was released a dozen years ago. There's nothing it does that > wasn't already thoroughly entrenched a decade ago. pytz is widely used. datetime objects with dateutil and pytz tzinfo behave differently as shown above. There are no non-fixed tzinfos in stdlib. dst-tzinfo in stdlib could adopt either pytz or dateutil behavior.
If dateutil can be fixed to work correctly using the disambiguation flag then its behavior is preferable because it eliminates localize, normalize calls except localize() could be useful in __new__ if first parameter is None to raise an exception for invalid input otherwise it is equivalent to the default *first* value. >> ... >> However... this also means the new flag on the datetime instances is >> largely irrelevant to pytz. pytz' API will need to remain the same. > > My hope was that 495 alone would at least spare pytz's users from > needing to do a `.normalize()` dance after `.astimezone()` anymore. > Although I'm not clear on why it's needed even now. > As far as I know, normalize() is not necessary after astimezone() even now https://answers.launchpad.net/pytz/+question/249229 >> ... >> For pytz users, being able to write a function do tell if the data you >> were given is broken is a step backwards. When constructing a datetime >> instance with pytz, users have the choice of raising exceptions or >> having pytz normalize the input. They are never given broken data (by >> their definition), and there is no need to weed it out. > > Assuming they follow all "the rules", yes? For example, if they > forget to use .localize(), etc, it seems like anything could happen. > What if they use .replace()?: .combine()? Unpickle a datetime > representing a missing time? Etc. I don't see that pytz has anything > magical to check datetimes created by those. If people forget localize() then tzinfo is not attached and an exception is raised later. It is like mixing bytes and Unicode: if you forget decode() then an exception is raised later. replace() is just a shortcut for a constructor. combine() returns naive objects. You can unpickle non-normalized datetime. replace(first=None) may force normalization. Your program may avoid producing non-normalized values. 
You can choose to save utc+tzid (to use whatever tzdata is available) or utc+tzid+tzdata-version (if you need the same local time) to restore it later. >> ... >> I think all functions that can create datetime instances will need the >> new optional flag and the flag should be tri-state, defaulting to not >> whine. > > See the "PEP-495 - Strict Invalid Time Checking" thread for more. > There seems to be increasing "feature creep" here. Rewriting vast > swaths of datetime internals to cater to this is at best impractical, > especially compared to supplying a "check this datetime" function > users who care can call when they care. Nevertheless, it's a suitable > subject for a different PEP. I don't want to bog 495 down with it. > If it had _stopped_ with asking for an optional check in the datetime > constructor, it may have been implemented already ;-) It may be a subject for another PEP but here's a possible implementation: class datetime: def __new__(...): if first is None and hasattr(tzinfo, 'localize'): self = tzinfo.localize(naive, is_dst=None) # may raise InvalidTime note: self.first is never None i.e., utcoffset(), tzname(), dst() etc always see either first=True or first=False. From guido at python.org Wed Aug 26 21:00:59 2015 From: guido at python.org (Guido van Rossum) Date: Wed, 26 Aug 2015 12:00:59 -0700 Subject: [Datetime-SIG] DST explained visually Message-ID: I've drawn a simple diagram showing the relationship between UTC and local time throughout a DST cycle: https://www.dropbox.com/s/ptx58d9zkd7m4vj/2015-08-26%2010.28.38.jpg?dl=0 tl;dr: Timeline arithmetic moves along the X axis (UTC); classic arithmetic moves along the Y axis (local time). Discussion: On the X axis I've drawn UTC. On the Y axis I've drawn local time. Ignoring most of the interesting bits, we observe that the mapping from UTC to local time is a function -- for each UTC value there is exactly one correct answer to the question "what is the local time". 
If we didn't have DST, the plot would be a single straight line at an angle of 45 degrees, i.e. the identity function (y = x). Regardless of the exact function, mapping from local time back to UTC is done by looking up the local time on the Y axis, going right parallel to the X axis until you hit a point on the graph, and then going down to the X axis. What you find there will be UTC. The interesting (and confusing) part is that with DST (or other changes that move local time back) *the inverse is no longer a function*. Let's look at DST in detail. There are four interesting points along the X (UTC) axis: A, B, C, D. I've also drawn four interesting points along the Y (local time) axis: P, Q, R, S. For simplicity, I am assuming DST moves the clock forward by one hour. However, as the diagram is not labeled in absolute units, the individual transitions can also be used to discuss other transitions (e.g. the areas around A could be used to discuss a non-DST-related timezone adjustment, or C-D could represent a leap second). Points along the UTC (X) axis: (A) Start of DST; local clock is moved one hour forward. This moves local time from P to Q. (B) One hour before the end of DST. This is the beginning of a period in UTC where mapping to local time and back can cause ambiguities. Note that the period A-B is in reality several months (e.g. from some time in March or April till some time in October or November, in the northern hemisphere). (C) End of DST; local clock is moved one hour back, moving local time from S to R (!). (D) One hour after the end of DST. This is the end of the UTC period where ambiguous local times matter. Points along the local time (Y) axis: (P) Local standard time when DST starts; local clock moves one hour forward, from P to Q in an instant. (Q) Local daylight saving time at the start of DST. (R) Start of ambiguous local time. (S) End of ambiguous local time. The first time we hit this point we move the local clock back by one hour (to R).
The second time we hit this point we do nothing. The proposal in PEP 495 adds a 'fold' flag whose value is 0 *except* for local times mapped from UTC period C-D; between C and D local time is between R and S with fold=1. (Note that the current text of the PEP has a flag named 'first' whose definition is the opposite; but the plan is to switch to fold=0. In any case it's one bit of information and it's only used for times between P-Q.) It may not be obvious from the diagram, but by taking the fold flag and the rest of the local time together, the mapping from local time (augmented with the fold flag) to UTC is once again a function, at least in the domain R-S (the fold). In the gap (P-Q) the inverse has no value; but PEP 495 extends the meaning of the fold flag to assign a meaning here too, by mapping P-Q with fold=0 to the extension (up and to the right) of the plot of standard time, and P-Q with fold=1 to the extension (down and to the left) of the plot of DST. (The diagram doesn't label the relevant points but there are dotted lines representing these extensions). My final observation is that the distinction between classic and timeline arithmetic is simply the distinction between the two axes of the plot: timeline==X, classic==Y. Both are useful in different contexts. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Wed Aug 26 21:37:04 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 26 Aug 2015 15:37:04 -0400 Subject: [Datetime-SIG] DST explained visually In-Reply-To: References: Message-ID: On Wed, Aug 26, 2015 at 3:00 PM, Guido van Rossum wrote: > The proposal in PEP 495 adds a 'fold' flag whose value is 0 *except* for > local times mapped from UTC period C-D; between C and D local time is > between R and S with fold=1. 
(Note that the current text of the PEP has a > flag named 'first' whose definition is the opposite; but the plan is to > switch to fold=0. In any case it's one bit of information and it's only > used for times between P-Q.) I have edited [1] your sketch to show the UTC mappings of two local times: g in the gap and f in the fold: (g, fold=0) maps to G0, (g, fold=1) maps to G1, (f, fold=0) maps to F0, and (f, fold=1) maps to F1. Note that G1 < G0 while F1 > F0. This may look arbitrary, but it follows from a consistent rule: fold=0 is the intersection with the line that is solid (valid) before the transition and fold=1 is the intersection with the line that is solid (valid) after the transition. [1]: https://github.com/abalkin/ltdf/blob/master/dst-visual.jpg -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Wed Aug 26 22:06:21 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 26 Aug 2015 15:06:21 -0500 Subject: [Datetime-SIG] DST explained visually In-Reply-To: References: Message-ID: [Guido] > I've drawn a simple diagram showing the relationship between UTC and local > time throughout a DST cycle: > > https://www.dropbox.com/s/ptx58d9zkd7m4vj/2015-08-26%2010.28.38.jpg?dl=0 Cool! But in the spirit of mailing lists, I want to complain about the asymmetry of the labels: A, B, C, D are the first four letters of the English alphabet, so for symmetry it's just plain broken that you didn't use W, X, Y, Z for the other labels ;-) Another thing to note: as Isaac observed, while UTC->local is a function, that's not as exploitable as one might hope, because it's not a continuous function. However, the diagram as a whole shows a collection of 3 piecewise continuous bijections (each solid diagonal line segment is a one-to-one continuous function "in both directions" - and also monotonic). That's highly exploitable. 
Indeed, for times through 2037, tzfiles explicitly store all points akin to A and C (the UTC points bounding the piecewise continuous bijections) in a sorted list. It's a minor annoyance that there are an infinite number of such points ;-) From alexander.belopolsky at gmail.com Wed Aug 26 22:06:50 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 26 Aug 2015 16:06:50 -0400 Subject: [Datetime-SIG] DST explained visually In-Reply-To: References: Message-ID: On Wed, Aug 26, 2015 at 3:00 PM, Guido van Rossum wrote: > > Note that the current text of the PEP has a flag named 'first' whose definition is the opposite; but the plan is to switch to fold=0. I decided to establish the origin of the term "fold" before editing the PEP. The credit for introducing the term "fall-backward fold" in computing goes to Paul Eggert of UCLA who used it in various discussions related to the C language standard that culminated in a defect report #139 [1]. However, apparently, the idea goes back to 1917 Germany. As Paul Eggert explained in private correspondence, """ fold=0 and fold=1 is like the longstanding German standard for expressing times as strings, which uses "A" and "B" to distinguish time stamps that would otherwise be ambiguous. If I understand things correctly, normally fold=0, but you can have fold=1 when time stamps are repeated. For more about the German standard, you can start with the Wikipedia description here: [2] """ [1]: http://www.open-std.org/jtc1/sc22/wg14/docs/rr/dr_136.html [2]: https://de.wikipedia.org/wiki/Sommerzeit#Offizielle_Regelung_der_Zeitumstellung -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From guido at python.org Wed Aug 26 22:09:58 2015 From: guido at python.org (Guido van Rossum) Date: Wed, 26 Aug 2015 13:09:58 -0700 Subject: [Datetime-SIG] DST explained visually In-Reply-To: References: Message-ID: On Wed, Aug 26, 2015 at 12:37 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > On Wed, Aug 26, 2015 at 3:00 PM, Guido van Rossum > wrote: > >> The proposal in PEP 495 adds a 'fold' flag whose value is 0 *except* for >> local times mapped from UTC period C-D; between C and D local time is >> between R and S with fold=1. (Note that the current text of the PEP has a >> flag named 'first' whose definition is the opposite; but the plan is to >> switch to fold=0. In any case it's one bit of information and it's only >> used for times between P-Q.) > > > I have edited [1] your sketch to show the UTC mappings of two local times: > g in the gap and f in the fold: (g, fold=0) maps to G0, (g, fold=1) maps to > G1, (f, fold=0) maps to F0, and (f, fold=1) maps to F1. Note that G1 < G0 > while F1 > F0. This may look arbitrary, but it follows from a consistent > rule: fold=0 is the intersection with the line that is solid (valid) before > the transition and fold=1 is the intersection with the line that is solid > (valid) after the transition. > > [1]: https://github.com/abalkin/ltdf/blob/master/dst-visual.jpg > Cool! Which reminds me, there are some edge cases to consider. What's the local time for UTC=A? And for UTC=C? I guess the rule is to use half-open intervals on the X axis that are open on the right, so that A maps to Q and C maps to R. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
From tim.peters at gmail.com Wed Aug 26 22:12:11 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 26 Aug 2015 15:12:11 -0500 Subject: [Datetime-SIG] DST explained visually In-Reply-To: References: Message-ID: [Alexander] > I have edited [1] your sketch to show the UTC mappings of two local times: g > in the gap and f in the fold: (g, fold=0) maps to G0, (g, fold=1) maps to > G1, (f, fold=0) maps to F0, and (f, fold=1) maps to F1. Note that G1 < G0 > while F1 > F0. This may look arbitrary, but it follows from a consistent > rule: fold=0 is the intersection with the line that is solid (valid) before > the transition and fold=1 is the intersection with the line that is solid > (valid) after the transition. > > [1]: https://github.com/abalkin/ltdf/blob/master/dst-visual.jpg So we're switching to a boolean flag again, named solidbefore? The great advantage to that would be ease of finding via Google. While there are a hundred times more hits on "python which" than on "python fold", there are still over 800,000 hits on the latter. There are only 7 on python "solidbefore" although you need the quotes to cut it back from some thousands ;-) From guido at python.org Wed Aug 26 22:21:51 2015 From: guido at python.org (Guido van Rossum) Date: Wed, 26 Aug 2015 13:21:51 -0700 Subject: [Datetime-SIG] DST explained visually In-Reply-To: References: Message-ID: Here's a cleaned-up scan of my diagram: https://www.dropbox.com/s/q59g183ozahkk5n/DSTplot.pdf?dl=0 -- --Guido van Rossum (python.org/~guido) From alexander.belopolsky at gmail.com Wed Aug 26 22:22:14 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 26 Aug 2015 16:22:14 -0400 Subject: [Datetime-SIG] DST explained visually In-Reply-To: References: Message-ID: On Wed, Aug 26, 2015 at 4:12 PM, Tim Peters wrote: > So we're switching to a boolean flag again, named solidbefore?
Nah, now that we know the history nothing will beat "sommerzeitkaputt"! -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Wed Aug 26 22:28:11 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 26 Aug 2015 15:28:11 -0500 Subject: [Datetime-SIG] DST explained visually In-Reply-To: References: Message-ID: [Guido] > Which reminds me, there are some edge cases to consider. What's the > local time for UTC=A? And for UTC=C? I guess the rule is to use half-open > intervals on the X axis that are open on the right, so that A maps to Q and > C maps to R. This is clearer ;-) using the hyperreal number line: https://en.wikipedia.org/wiki/Hyperreal_number Transitions theoretically happen at A-h and C-h, where h is any infinitesimal > 0 (any hyperreal number strictly greater than 0 and strictly less than any real number). So your intuition is right. You may ask "but what are the local times corresponding to A-h and C-h?". That would just be making trouble for no good reason. There are far too few points on a real number line to display A-h or C-h, so "who cares?" is appropriate for _many_ reasons ;-) From tim.peters at gmail.com Wed Aug 26 22:31:20 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 26 Aug 2015 15:31:20 -0500 Subject: [Datetime-SIG] DST explained visually In-Reply-To: References: Message-ID: [Tim] >> So we're switching to a boolean flag again, named solidbefore? [Alex] > Nah, now that we know the history nothing will beat "sommerzeitkaputt"! Bingo! No Google hits at all on "Python sommerzeitkaputt": Although Google is so annoyed by that it says "Showing results for python sommerzeit kaputt" instead, and then we're back to tens of thousands of hits. But good enough. 
sommerzeitkaputt has such a nice ring to it :-) From ischwabacher at wisc.edu Wed Aug 26 23:04:45 2015 From: ischwabacher at wisc.edu (ISAAC J SCHWABACHER) Date: Wed, 26 Aug 2015 21:04:45 +0000 Subject: [Datetime-SIG] DST explained visually In-Reply-To: References: Message-ID: > However, apparently, the idea goes back to 1917 Germany. As Paul Eggert explained in private correspondence, > > """ > fold=0 and fold=1 is like the longstanding German standard for expressing times as strings, which uses "A" and "B" to distinguish time stamps that would otherwise be ambiguous. If I understand things correctly, normally fold=0, but you can have fold=1 when time stamps are repeated. For more about the German standard, you can start with the Wikipedia description here: [2] > """ > > [2]: https://de.wikipedia.org/wiki/Sommerzeit#Offizielle_Regelung_der_Zeitumstellung My favorite part of that Wikipedia article is this sentence: "Sie darf nicht verwechselt werden mit der international üblichen Abkürzung der Zeitzonen, nach der 2 Uhr MESZ als 0200B und 2 Uhr MEZ als 0200A bezeichnet werden." [In English: "It must not be confused with the internationally customary abbreviation of time zones, according to which 2 o'clock CEST is designated 0200B and 2 o'clock CET 0200A."] Which is to say (IIUC) that the international standard is that the first occurrence is B and the second is A, exactly the opposite of the German system. O Weh! ijs From alexander.belopolsky at gmail.com Wed Aug 26 23:21:37 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 26 Aug 2015 17:21:37 -0400 Subject: [Datetime-SIG] DST explained visually In-Reply-To: References: Message-ID: On Wed, Aug 26, 2015 at 5:04 PM, ISAAC J SCHWABACHER wrote: > > [2]: > https://de.wikipedia.org/wiki/Sommerzeit#Offizielle_Regelung_der_Zeitumstellung > > My favorite part of that Wikipedia article is this sentence: > > "Sie darf nicht verwechselt werden mit der international üblichen > Abkürzung der Zeitzonen, nach der 2 Uhr MESZ als 0200B und 2 Uhr MEZ als > 0200A bezeichnet werden."
> > Which is to say (IIUC) that the international standard is that the first > occurrence is B and the second is A, exactly the opposite of the German > system. O Weh! > I think they just caution you not to confuse the "A" and "B" times with the (military) letter zones A through Z. The fold disambiguation seems to be in the natural A first, then B order. From ischwabacher at wisc.edu Wed Aug 26 23:35:28 2015 From: ischwabacher at wisc.edu (ISAAC J SCHWABACHER) Date: Wed, 26 Aug 2015 21:35:28 +0000 Subject: [Datetime-SIG] DST explained visually In-Reply-To: References: Message-ID: Oh, yes I completely misinterpreted that sentence. Derf. I was led astray by the coincidence that CEST is +0200 and CET +0100. ijs From jcronin at egh.com Wed Aug 26 22:54:02 2015 From: jcronin at egh.com (jon) Date: Wed, 26 Aug 2015 16:54:02 -0400 Subject: [Datetime-SIG] Slightly better name for fold?
In-Reply-To: References: Message-ID: "fold" -> "folded"? "fold" strikes me as imperative, rather than descriptive. Jonathan Jonathan Cronin jcronin at egh.com 781 861 0670 EGH, Inc. 55 Waltham Street Lexington, MA 02421 From alexander.belopolsky at gmail.com Thu Aug 27 00:02:53 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 26 Aug 2015 18:02:53 -0400 Subject: [Datetime-SIG] Slightly better name for fold? In-Reply-To: References: Message-ID: On Wed, Aug 26, 2015 at 4:54 PM, jon wrote: > "fold" -> "folded"? > > "fold" strikes me as imperative, rather than descriptive. > Note that "fold" is a noun, not a verb in the context of PEP 495. Think of a bifold wallet as consisting of two "folds" rather than a piece of leather folded once. From alexander.belopolsky at gmail.com Thu Aug 27 00:13:18 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 26 Aug 2015 18:13:18 -0400 Subject: [Datetime-SIG] Slightly better name for fold? In-Reply-To: References: Message-ID: On Wed, Aug 26, 2015 at 6:02 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > On Wed, Aug 26, 2015 at 4:54 PM, jon wrote: > >> "fold" -> "folded"? >> >> "fold" strikes me as imperative, rather than descriptive. >> > > Note that "fold" is a noun, not a verb in the context of PEP 495. Think > of a bifold wallet as consisting of two "folds" rather than a piece of > leather folded once. Using a noun for the name of an attribute is in line with the current naming scheme: time(hour=1, minute=30, second=0, fold=1)
URL: From tim.peters at gmail.com Thu Aug 27 03:37:18 2015 From: tim.peters at gmail.com (Tim Peters) Date: Wed, 26 Aug 2015 20:37:18 -0500 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: <87zj1d7usy.fsf@gmail.com> References: <87zj1d7usy.fsf@gmail.com> Message-ID: [Akira Li <4kir4.1i at gmail.com>] > ... > Even if datetime's or replace()'s *first* parameter would be 3-state > None|True|False; the internal flag can still be 2-state True|False. > > first=None could cause a tzinfo callback (it implies that tzinfo must > not be None in this case) that sets *first* to True|False appropriately. Well, you have your ideas on this, and others have theirs. This isn't going to make progress until the people who want it get together and agree among themselves first on a single, unified, comprehensive proposal. So please take this to the following thread instead (but after reading all of it first ;-) ): PEP-495 - Strict Invalid Time Checking > ... > There is a $5 wifi button that can be used to track baby data. > Python helps at various stages: > https://medium.com/@edwardbenson/how-i-hacked-amazon-s-5-wifi-button-to-track-baby-data-794214b0bdd8 > > Babies can poop at night and during DST transitions too. Sleep-deprived > parents should be able to see the tracking data in local time in > addition to UTC (doing timezone conversions is computer's job). Nobody has said some apps don't need reliable conversions (to the contrary, that's the primary _point_ of PEP 495). Nobody has said some apps don't need timeline arithmetic - although I have said it's poor practice to even _try_ to do timeline arithmetic if an app isn't working in UTC or with naive datetimes. If an app is following best practice (UTC or naive datetimes), then timeline arithmetic is what they _always_ get (it's the same thing as classic arithmetic in those contexts). 
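A small illustration of the last point, as a sketch using only the standard library: in UTC the offset never changes, so ordinary (classic) `+` with a timedelta also moves exactly that far along the timeline. The particular instant below is an arbitrary value chosen for the example.

```python
from datetime import datetime, timedelta, timezone

# An arbitrary UTC instant, chosen near a US fall-back transition.
start = datetime(2015, 11, 1, 5, 30, tzinfo=timezone.utc)

# Classic "+" in UTC: the clock face advances by 24 hours...
later = start + timedelta(hours=24)

# ...and so does real elapsed time, measured via POSIX timestamps.
elapsed_seconds = later.timestamp() - start.timestamp()

print(later.isoformat(), elapsed_seconds)
```

The same holds for naive datetimes, where there is no offset to change in the first place.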
> On the internet, people may cooperate while being in different time > zones i.e., even "business" software might have to work during DST > transitions. MMORPGs are probably also not limited to a single time > zone. Ditto. > Non-pytz timezones make mistake on the order of an hour regularly. > It is *three orders of magnitude larger* than a second. It is a different > class of errors. The code that can't handle ~1s errors over short period > of time should use time.monotonic() anyway. Apps that care about leap seconds _should_ be using TAI. Apps that want timeline arithmetic _should_ be using UTC. Unfortunately, people shoot themselves in the feet all the time. Python can't stop that. But it doesn't have to _cater_ to poor practices either. > ... > dateutil doesn't work during DST transitions but PEP 495 might allow to > fix it. I don't know what "doesn't work" means, precisely. There are certain behaviors that do and don't work as you might hope. For example, even the stupidest possible tzinfo implementation that follows the docs today has no problem converting from UTC to local time across DST transitions - the default .fromutc() was designed to ensure that conversion in _that_ direction mimics the local clock in all cases (including skipping a local hour at DST start, and repeating a local hour at DST end - where "hour" really means "whole number of minutes"). What's impossible now (outside of pytz) is converting ambiguous local times _back_ to UTC in all cases. PEP 495 will repair that - that's its primary point. There's no "might" about it. But, for that to be of use to dateutil users, dateutil will need to change its tzinfo implementation to meet 495's new tzinfo requirements. 
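To make the conversion point concrete, here is a minimal sketch of a tzinfo that honors the PEP 495 `fold` attribute (Python 3.6 or later). `FoldAwareEastern` is a hypothetical toy class written for this example, not dateutil's or pytz's code; it models only a single fall-back transition (UTC-4 before 2015-11-01 06:00 UTC, UTC-5 after) and ignores every other rule of the real zone.

```python
from datetime import datetime, timedelta, timezone, tzinfo

HOUR = timedelta(hours=1)
TRANSITION_UTC = datetime(2015, 11, 1, 6, 0)   # naive UTC instant of the fall-back
FOLD_START = datetime(2015, 11, 1, 1, 0)       # local wall times in [01:00, 02:00)
FOLD_END = datetime(2015, 11, 1, 2, 0)         # occur twice


class FoldAwareEastern(tzinfo):
    """Hypothetical toy zone with one fall-back transition (assumption)."""

    def utcoffset(self, dt):
        wall = dt.replace(tzinfo=None)
        if wall < FOLD_START:
            return -4 * HOUR
        if wall >= FOLD_END:
            return -5 * HOUR
        # Ambiguous hour: fold=0 is the first occurrence, fold=1 the second.
        return -5 * HOUR if dt.fold else -4 * HOUR

    def dst(self, dt):
        return self.utcoffset(dt) + 5 * HOUR

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"

    def fromutc(self, dt):
        # dt carries UTC wall values with tzinfo=self.
        utc_wall = dt.replace(tzinfo=None)
        if utc_wall < TRANSITION_UTC:
            return dt - 4 * HOUR
        local = dt - 5 * HOUR
        if utc_wall < TRANSITION_UTC + HOUR:
            local = local.replace(fold=1)      # second pass through 01:xx
        return local


tz = FoldAwareEastern()
first = datetime(2015, 11, 1, 1, 30, tzinfo=tz)            # fold=0
second = datetime(2015, 11, 1, 1, 30, tzinfo=tz, fold=1)

utc_first = first.astimezone(timezone.utc)
utc_second = second.astimezone(timezone.utc)
round_trip = utc_second.astimezone(tz)

print(utc_first, utc_second, round_trip.fold)
```

With the flag in place, the two occurrences of 01:30 map to different UTC instants and the second one survives a round trip, which is the direction of conversion that could not be done reliably before.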
> As I understand, outside of DST transitions if dates are unique valid
> local times; dateutil uses "same time tomorrow":
>
>   (d_with_dateutil_tzinfo + DAY ==
>    d.tzinfo.localize(d.replace(tzinfo=None) + DAY, is_dst=None))
>
> while pytz uses "+24 hours":
>
>   dt_add(d_with_dateutil_tzinfo, DAY) == d + DAY
>
> where dt_add() is defined below. The equality works but (d + DAY) may
> have a wrong tzinfo object if the arithmetic crosses DST boundaries (but
> it has correct timestamp/utc time anyway). d.tzinfo.normalize(d + DAY)
> should be used to get the correct tzinfo e.g. for displaying the result.
>
> Both types of operations should be supported.

If you're saying that classic and timeline arithmetic both have legitimate uses, sure. Nobody has said otherwise. If you're trying to say more than just that, sorry, I missed the point.

As to "supported", there are _degrees_ of support, and Python very obviously favors classic arithmetic. That can't change. I personally have no interest in providing more support for timeline arithmetic _beyond_ getting PEP 495 implemented so that error-free timeline arithmetic _can_ be implemented easily. At that point, my interest ends. I believe I've already been very clear that it's fine by me if the only further support Python supplies is to add some one-line Python functions to the docs implementing the 3 flavors of timeline arithmetic (datetime-datetime and datetime +/- timedelta) - but near the end of a new section explaining that working with UTC datetimes instead is far better practice for timeline arithmetic use cases.

> ...
> pytz is widely used. datetime objects with dateutil and pytz tzinfo
> behave differently as shown above.
>
> There are no non-fixed tzinfos in stdlib. dst-tzinfo in stdlib could
> adopt either pytz or dateutil behavior.
I don't know whether Stuart mucked with arithmetic because he believed that was necessary in order to get conversions to work correctly (if so, he was mistaken), or whether the effects on arithmetic were just a _consequence_ of using fixed-offset classes all the time (that's "a natural" outcome of using only fixed-offset classes - it would take extra effort to _stop_ it - classic and timeline arithmetic are the same thing in any eternally-fixed-offset timezone) . He said, in an earlier message, that conversion was his primary concern. But maybe we're all using the same words with different meanings. In any case, conversions are my - and PEP 495's - only real concern. Because timeline arithmetic is inappropriate for datetime's "naive time" model, is incompatible with what Python has been doing for a dozen years already, is far slower than classic arithmetic, and because people who need timeline arithmetic "shouldn't be" using non-UTC aware-datetimes at all for arithmetic, I don't see any chance of pytz's behaviors being adopted in all respects by Python. Nor dateutil's. That one can't always do conversions correctly today. After PEP 495 is implemented, whoever steps up to supply a wrapping of the Olson database with 495-compliant tzinfos will probably get rubber-stamp approval to fold it into the core. I'd also like to see dateutil's wrappings of timezones obtained from VTIMEZONE files, POSIX-TZ strings, and the Microsoft registry folded in. Not all apps _can_ use zoneinfo. zoneinfo is by far the most important, though. I prioritize. That's something mailing lists are incapable of, which is why no mailing list has ever released any software ;-) > If dateutil can be fixed to work correctly using the disambiguation flag > then its behavior is preferable because it eliminates localize, > normalize calls Then you get classic arithmetic. Which is not only fine by me, I believe it's the only realistic outcome for the reasons explained just above. 
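The difference between the two kinds of arithmetic can be sketched in a few lines. `ToyZone` below is a hypothetical, deliberately naive tzinfo (UTC-4 before a single fall-back on 2015-11-01, UTC-5 after) that relies on the default `.fromutc()`; `dt_add()` is the one-line timeline helper quoted in this thread, and `elapsed()` is a helper added here only to measure real elapsed time.

```python
from datetime import datetime, timedelta, tzinfo

HOUR = timedelta(hours=1)
FALL_BACK = datetime(2015, 11, 1, 2, 0)   # local wall time at which DST ends


class ToyZone(tzinfo):
    """Hypothetical zone: UTC-4 (DST) before FALL_BACK, UTC-5 afterwards."""

    def utcoffset(self, dt):
        return -5 * HOUR + self.dst(dt)

    def dst(self, dt):
        return HOUR if dt.replace(tzinfo=None) < FALL_BACK else timedelta(0)

    def tzname(self, dt):
        return "DST" if self.dst(dt) else "STD"


def dt_add(dt, td):
    # Timeline addition; the inner "+" must be classic for this to work.
    return dt.tzinfo.fromutc(dt + (td - dt.utcoffset()))


def elapsed(a, b):
    # Real elapsed time from a to b, computed on naive UTC values.
    a_utc = a.replace(tzinfo=None) - a.utcoffset()
    b_utc = b.replace(tzinfo=None) - b.utcoffset()
    return b_utc - a_utc


start = datetime(2015, 10, 31, 12, 0, tzinfo=ToyZone())
day = timedelta(days=1)

classic = start + day           # "same time tomorrow": noon on Nov 1
timeline = dt_add(start, day)   # "+24 hours": 11:00 on Nov 1

print(classic, elapsed(start, classic))
print(timeline, elapsed(start, timeline))
```

Across the transition, classic addition keeps the wall-clock reading and spans 25 real hours, while timeline addition spans exactly 24 hours and lands an hour earlier on the clock face.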
> except localize() could be useful in __new__ if first > parameter is None to raise an exception for invalid input otherwise it > is equivalent to the default *first* value. That one is for the "PEP-495 - Strict Invalid Time Checking" thread. >> ... >> My hope was that 495 alone would at least spare pytz's users from >> needing to do a `.normalize()` dance after `.astimezone()` anymore. >> Although I'm not clear on why it's needed even now. > As far as I know, normalize() is not necessary after astimezone() even > now > https://answers.launchpad.net/pytz/+question/249229 That agrees with my best guess, but my knowledge of pytz is shallow. If it's correct that the .normalize() dance isn't needed here, it would be nice if Stuart plainly said so on that page, and - of course - changed the docs to stop saying it _is_ required. And then it's also the case that I don't see any benefit to pytz from PEP 495 alone. :-( >>> For pytz users, being able to write a function do tell if the data you >>> were given is broken is a step backwards. When constructing a datetime >>> instance with pytz, users have the choice of raising exceptions or >>> having pytz normalize the input. They are never given broken data (by >>> their definition), and there is no need to weed it out. >> Assuming they follow all "the rules", yes? For example, if they >> forget to use .localize(), etc, it seems like anything could happen. >> What if they use .replace()?: .combine()? Unpickle a datetime >> representing a missing time? Etc. I don't see that pytz has anything >> magical to check datetimes created by those. > If people forget localize() then tzinfo is not attached and an exception > is raised later. It is like mixing bytes and Unicode: if you forget > decode() then an exception is raised later. AFAICT, pytz can't enforce anything. You don't _need_ to call localize() to get _a_ datetime. 
From scanning message boards, e.g., I see it's a common mistake for new pytz users to use datetime.datetime(..., tzinfo=...;) directly, not using localize() at all, despite the very clear instructions in the docs that they must _not_ do that. That can be a real problem for modules fighting basic design warts: newcomers are lost at first, and even experts can have trouble inter-operating with code _outside_ what typically becomes an increasingly self-contained world (e.g., Isaac cheerfully complained earlier about his pains trying to get pytz and dateutil to work together). > replace() is just a shortcut for a constructor. Yet pytz does nothing to check .replace() results, right? > combine() returns naive objects. Not always true. Plain `time` objects can have a tzinfo of their own. Pass one of those to .combine(), and you get an aware datetime. And it's generally impossible to check a `time` on its own for fold/gap - you generally need a date too to have any chance of determining that. Anyway, that - and the rest below - belong in the "PEP-495 - Strict Invalid Time Checking" thread. I'm outta here ;-) > ... From alexander.belopolsky at gmail.com Thu Aug 27 04:35:32 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Wed, 26 Aug 2015 22:35:32 -0400 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: <87zj1d7usy.fsf@gmail.com> Message-ID: [Akira Li] > Even if datetime's or replace()'s *first* parameter would be 3-state > None|True|False; the internal flag can still be 2-state True|False [Tim Peters] > So please take this to the following thread instead (but > after reading all of it first ;-) ): > > PEP-495 - Strict Invalid Time Checking Actually a follow-up thread "Strict Invalid Time Checking: an idea for another PEP" may be a better (and shorter!) fit. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From 4kir4.1i at gmail.com Thu Aug 27 12:28:07 2015 From: 4kir4.1i at gmail.com (Akira Li) Date: Thu, 27 Aug 2015 13:28:07 +0300 Subject: [Datetime-SIG] DST explained visually In-Reply-To: (Tim Peters's message of "Wed, 26 Aug 2015 15:12:11 -0500") References: Message-ID: <87io816n60.fsf@gmail.com> Tim Peters writes: ... > So we're switching to a boolean flag again, named solidbefore? The > great advantage to that would be easy of finding via Google. While > there a hundred times more hits on "python which" than on "python > fold", there are still over 800,000 hits on the latter. There only 7 > on > > python "solidbefore" > > although you need the quotes to cut it back from some thousands ;-) The first hit on google for *python datetime fold* is PEP 495. From stuart at stuartbishop.net Thu Aug 27 13:13:37 2015 From: stuart at stuartbishop.net (Stuart Bishop) Date: Thu, 27 Aug 2015 18:13:37 +0700 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: <87zj1d7usy.fsf@gmail.com> Message-ID: On 27 August 2015 at 08:37, Tim Peters wrote: > I don't know whether Stuart mucked with arithmetic because he believed > that was necessary in order to get conversions to work correctly (if > so, he was mistaken), or whether the effects on arithmetic were just a > _consequence_ of using fixed-offset classes all the time (that's "a > natural" outcome of using only fixed-offset classes - it would take > extra effort to _stop_ it - classic and timeline arithmetic are the > same thing in any eternally-fixed-offset timezone) . He said, in an > earlier message, that conversion was his primary concern. But maybe > we're all using the same words with different meanings. The fixed-offset classes and sorting arithmetic were the only way to get things round tripping. Take a datetime. Convert it to another timezone. Add one hour to both. Compare. The results were inconsistent. 
You would only get correct results with fixed offset timezones, because the builtin arithmetic ignored the timezone, because there was no is_dst flag and without it it is impossible to get correct results. The burden was left on tzinfo implementations to deal with the problem. You could have naive times and do arithmetic correctly, or you could have zone aware times and do conversions correctly, but to do both developers had to always convert to and from utc to do the arithmetic. And developers being lazy creatures wouldn't bother because it would normally work, or even always work in their particular timezone, and systems would crash at 4am killing innocent kittens. And this was a problem with my tzinfo implementation, because the only way you could possibly experience the problem was by using my tzinfo implementation. Python had avoided this clearly documented problem by not supplying any tzinfo implementations, even though it would have been easy to create a 'local' one using the information already exposed in the time module, and I'd always assumed that fixing it was a requirement of adding timezone implementations to the standard library. So I fixed it. Drunk on my own cleverness and relative youth, it never occurred to me that it was possible to rationalize the existing behaviour with a straight face, where after going to all the effort of constructing and adding a tzinfo to your datetime it would sit there entirely ignored by Python, except for conversion operations, consistently giving you answers that are demonstrably incorrect using most modern timekeeping systems. I'm still not capable of conjuring up such a monumental rationalization ;) >>> My hope was that 495 alone would at least spare pytz's users from >>> needing to do a `.normalize()` dance after `.astimezone()` anymore. >>> Although I'm not clear on why it's needed even now. 
> >> As far as I know, normalize() is not necessary after astimezone() even >> now >> https://answers.launchpad.net/pytz/+question/249229 > > That agrees with my best guess, but my knowledge of pytz is shallow. > If it's correct that the .normalize() dance isn't needed here, it > would be nice if Stuart plainly said so on that page, and - of course > - changed the docs to stop saying it _is_ required. And then it's > also the case that I don't see any benefit to pytz from PEP 495 alone. > :-( Yeah, I'm putting off answering that one because I'm not sure if I'll get the answer right. People sometimes think I actually know what I'm doing. I'll have a look after I get the overdue pytz release out. >> replace() is just a shortcut for a constructor. > > Yet pytz does nothing to check .replace() results, right? It has no opportunity to check the results at the moment, nor does it have the opportunity to swap in the correct fixed offset tzinfo. It just gets rudely shoved into a datetime instance without consent. So replace gives you what it gives you, and you have to sort it out with a call to normalize, at which point you realize you have a timezone definition from 1910 in play and learn to always construct your localized datetime instances with tzinfo.localize(). Such a lovely user experience. -- Stuart Bishop http://www.stuartbishop.net/ From 4kir4.1i at gmail.com Thu Aug 27 13:33:13 2015 From: 4kir4.1i at gmail.com (Akira Li) Date: Thu, 27 Aug 2015 14:33:13 +0300 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: (Tim Peters's message of "Wed, 26 Aug 2015 20:37:18 -0500") References: <87zj1d7usy.fsf@gmail.com> Message-ID: <87fv356k5i.fsf@gmail.com> Tim Peters writes: > [Akira Li <4kir4.1i at gmail.com>] ... > Nobody has said some apps don't need reliable conversions (to the > contrary, that's the primary _point_ of PEP 495). 
Nobody has said > some apps don't need timeline arithmetic - although I have said it's > poor practice to even _try_ to do timeline arithmetic if an app isn't > working in UTC or with naive datetimes. If an app is following best > practice (UTC or naive datetimes), then timeline arithmetic is what > they _always_ get (it's the same thing as classic arithmetic in those > contexts). > I agree on the best practices here. I would prefer that __add__ would be forbidden for local timezones unless they have a fixed utc offset. But it might be too late for that now. If __add__ is allowed for timezone-aware datetime objects then arithmetic "as though via conversion to utc time" is *equally valid* as the arithmetic "as though it is a timezone-naive datetime object". >> Non-pytz timezones make mistake on the order of an hour regularly. >> It is *three orders of magnitude larger* than a second. It is a different >> class of errors. The code that can't handle ~1s errors over short period >> of time should use time.monotonic() anyway. > > Apps that care about leap seconds _should_ be using TAI. Apps that > want timeline arithmetic _should_ be using UTC. Unfortunately, people > shoot themselves in the feet all the time. Python can't stop that. > But it doesn't have to _cater_ to poor practices either. > By your logic: Apps that care about timezone-naive arithmetic _should_ be using naive datetime objects. I agree it is a poor practice to perform arithmetic on localized time. But as long as such arithmetic *is* allowed then it *is* ambiguous what type of arithmetic should be used. There is no *one obvious* way here. >> ... >> dateutil doesn't work during DST transitions but PEP 495 might allow to >> fix it. > > I don't know what "doesn't work" means, precisely. There are certain > behaviors that do and don't work as you might hope. 
For example, even
> the stupidest possible tzinfo implementation that follows the docs
> today has no problem converting from UTC to local time across DST
> transitions - the default .fromutc() was designed to ensure that
> conversion in _that_ direction mimics the local clock in all cases
> (including skipping a local hour at DST start, and repeating a local
> hour at DST end - where "hour" really means "whole number of
> minutes"). What's impossible now (outside of pytz) is converting
> ambiguous local times _back_ to UTC in all cases. PEP 495 will repair
> that - that's its primary point. There's no "might" about it. But,
> for that to be of use to dateutil users, dateutil will need to change
> its tzinfo implementation to meet 495's new tzinfo requirements.

I've linked to a couple of dateutil bugs previously in the PEP-431/495 thread [1]. I was as surprised as you that dateutil's .fromutc() appears to be broken. I use "might" because I haven't read the dateutil code. I can't be sure, e.g., what backward-compatibility concerns might prevent PEP 495 from fixing its issues with an ambiguous local time. Timezones are a very complicated topic -- no solution works in the general case.

>> As I understand, outside of DST transitions if dates are unique valid
>> local times; dateutil uses "same time tomorrow":
>>
>>   (d_with_dateutil_tzinfo + DAY ==
>>    d.tzinfo.localize(d.replace(tzinfo=None) + DAY, is_dst=None))
>>
>> while pytz uses "+24 hours":
>>
>>   dt_add(d_with_dateutil_tzinfo, DAY) == d + DAY
>>
>> where dt_add() is defined below. The equality works but (d + DAY) may
>> have a wrong tzinfo object if the arithmetic crosses DST boundaries (but
>> it has correct timestamp/utc time anyway). d.tzinfo.normalize(d + DAY)
>> should be used to get the correct tzinfo e.g. for displaying the result.
>>
>> Both types of operations should be supported.
>
> If you're saying that classic and timeline arithmetic both have
> legitimate uses, sure. Nobody has said otherwise.
If you're trying
> to say more than just that, sorry, I missed the point.

Yes, that is exactly my point. My only objection is to the idea that timezone-naive arithmetic is somehow superior for localized times. Though I don't mind it as long as timezone conversions work.

...
>> If dateutil can be fixed to work correctly using the disambiguation flag
>> then its behavior is preferable because it eliminates localize,
>> normalize calls
>
> Then you get classic arithmetic. Which is not only fine by me, I
> believe it's the only realistic outcome for the reasons explained just
> above.

The key word here is "If". *If* it works; great. It is still possible to perform both types of arithmetic as the examples above demonstrate.

...
>> If people forget localize() then tzinfo is not attached and an exception
>> is raised later. It is like mixing bytes and Unicode: if you forget
>> decode() then an exception is raised later.
>
> AFAICT, pytz can't enforce anything. You don't _need_ to call
> localize() to get _a_ datetime. From scanning message boards, e.g., I
> see it's a common mistake for new pytz users to use
> datetime.datetime(..., tzinfo=...;) directly, not using localize() at
> all, despite the very clear instructions in the docs that they must
> _not_ do that...

I had this (incorrect as I've now realized) picture in mind:

  naive = datetime.strptime(...)
  aware = tz.localize(naive, is_dst=None)

If localize() is forgotten then we would get a naive object that would raise an exception in tzinfo-related methods later.

Yes. datetime(..., tzinfo=...) is a common error and I don't see how pytz can fix it *without a performance hit* with PEP 495 *alone*. But it *can* be fixed if performance is not a concern (perhaps with a slight change in the pickle semantics if relevant tzdata has changed).
> ...That can be a real problem for modules fighting basic > design warts: newcomers are lost at first, and even experts can have > trouble inter-operating with code _outside_ what typically becomes an > increasingly self-contained world (e.g., Isaac cheerfully complained > earlier about his pains trying to get pytz and dateutil to work > together). I agree, the usability is a real issue (for newcomers and experts). But dateutil doesn't work [1] in cases where pytz does work and therefore if people use dateutil now; the correctness is not their primary concern and it should be much easier to combine the two libraries if you don't actually need correct answers. [1] https://mail.python.org/pipermail/datetime-sig/2015-August/000467.html From ethan at stoneleaf.us Thu Aug 27 14:13:22 2015 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 27 Aug 2015 05:13:22 -0700 Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking In-Reply-To: References: <55DCC3CE.7070200@oddbird.net> <55DCCC55.2040908@oddbird.net> <55DCD068.9020604@oddbird.net> <55DCD499.6020801@oddbird.net> <55DCD84A.5060006@oddbird.net> Message-ID: <55DEFEE2.7030007@stoneleaf.us> On 08/25/2015 02:22 PM, Alexander Belopolsky wrote: > On Tue, Aug 25, 2015 at 5:04 PM, Carl Meyer wrote: >> >> I can't imagine how raising an exception on invalid times only if a >> non-default sentinel value is given for a flag that is _new in PEP 495_ >> could possibly break 4000 lines of existing datetime tests. > > OK, so datetime module itself will never set fold=-1. In the list below, can you mark the methods that need to be patched to check the fold attribute in your preferred design: If `fold=None` is present /when attempting to create a datetime/ that is ambiguous or invalid, an exception is raised /at that moment/ meaning that a datetime with `fold=None` /will never exist/. `fold=None` /will not be the default/. 
> datetime.__add__            no
> datetime.__eq__             no
> datetime.__format__         no
> datetime.__ge__             no
> datetime.__gt__             no
> datetime.__hash__           no
> datetime.__le__             no
> datetime.__lt__             no
> datetime.__ne__             no
> datetime.__new__            YES
> datetime.__radd__           no
> datetime.__reduce__         no
> datetime.__reduce_ex__      no
> datetime.__repr__           no
> datetime.__rsub__           no
> datetime.__str__            no
> datetime.astimezone         no
> datetime.combine            YES (could have a time instance with fold=None)
> datetime.ctime              no
> datetime.date               no
> datetime.dst                no
> datetime.fromordinal        no
> datetime.fromtimestamp      no
> datetime.isocalendar        no
> datetime.isoformat          no
> datetime.isoweekday         no
> datetime.now                no
> datetime.replace            YES
> datetime.strftime           no
> datetime.strptime           only if `fold=` is allowed
> datetime.time               no
> datetime.timestamp          no
> datetime.timetuple          no
> datetime.timetz             no
> datetime.today              no
> datetime.toordinal          no
> datetime.tzname             no
> datetime.utcfromtimestamp   no
> datetime.utcnow             no
> datetime.utcoffset          no
> datetime.utctimetuple       no
> datetime.weekday            no

--
~Ethan~

From stuart at stuartbishop.net Thu Aug 27 15:12:33 2015
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Thu, 27 Aug 2015 20:12:33 +0700
Subject: [Datetime-SIG] PEP-431/495
In-Reply-To:
References:
Message-ID:

On 26 August 2015 at 15:16, Tim Peters wrote:

>> In this case, I'm not fussed if the datetime instance has a 2 state or
>> 3 state flag. This is different to the various constructors which I
>> think need a 3 state flag in their arguments. True, False, None.
>
> As things seem to have progressed later, mapping pytz's explicit time
> checking into a more magical scheme sprayed all over the datetime
> internals is not straightforward. So, as I concluded elsewhere, that
> may or may not be done someday, but it's out of scope for PEP 495.
> I'm a fan of making progress ("now is better than never", where the
> latter was PEP 431's fate waiting for perfection on all counts).

Yup.

>> Classic behaviour as you describe it is a bug.
> > Believe me, you won't get anywhere with that approach. > > - Classic arithmetic is the only kind that makes good sense in the > "naive time" model, which _is_ datetime's model. [... sensible stuff trimmed ...] > That said, I would have preferred it if Python's datetime had used > classic arithmetic only for naive datetimes. I feared it might be > endlessly confusing if an aware datetime used classic arithmetic too. > I'm not sure about "endlessly" now, but it has come up more than once > ;-) Far too late to change now, though. I'm wondering if it is worth formalizing this (post-PEP-495,or maybe some choice wording changes made in the docs). Would it work if we introduced a new type, datetimetz? We would have a time, with a tzinfo because it might be useful later, a naive time, with a tzinfo because it is useful for rendering and conversions, and a datetimetz with all the complexities and slowdowns of timeline arithmetic. While not changing the behaviour of datetime at all, we could get cats and dogs living together by just clarifying what it actually is. >> If you use pytz tzinfo instances, adding 1 second always adds one second > > But only in the _model_ you have in mind: real-life clocks showing > real-life civil time suffer from leap seconds too. You can laugh that > off in _your_ apps (and I can too ;-) ), but for other apps it's dead > serious. If our underlying platforms that we needed to work with supported it, I'd probably be in favour of leap seconds. I doubt that would ever happen - there are more palatable workarounds. >> and adding 1 day always adds 24 hours. > > That's also true of classic arithmetic. The meanings of "day" and "24 > hours" also depend on the model in use. I think in my view, as soon as you go to the bother of adding a tzinfo instance to the datetime you are making a statement about the expected behaviour; that the simpler classic arithmetic no longer applies and the more complex model needs to be used. 
> def dt_add(dt, td):
>     return dt.tzinfo.fromutc(dt + (td - dt.utcoffset()))
>
> There you go: "timeline" datetime + timedelta arithmetic about as
> efficiently as possible in pure Python. Note that _if_ the default
> changed to timeline arithmetic, this code would no longer work. The
> "+" there requires classic arithmetic to get the right result. Change
> the default, this code would break too. I find it hard to imagine I'm
> the only person in the world who has code similarly taking advantage
> of what Python actually does.
>
> Example:

I see. What I don't like about this approach is that developers need to be aware that they need to call it, and that dt + timedelta(hours=24) may not work. Of course, developers will not be aware or have done more than skim the docs until after their guests have all died of salmonella poisoning from the undercooked turkey. It's one of the reasons I'm wondering if something more in your face like the datetimetz proposal above would be an improvement. Stop making me hungry dammit.

>> However... this also means the new flag on the datetime instances is
>> largely irrelevant to pytz. pytz' API will need to remain the same.
>
> My hope was that 495 alone would at least spare pytz's users from
> needing to do a `.normalize()` dance after `.astimezone()` anymore.
> Although I'm not clear on why it's needed even now.

Instead of one tzinfo instance, there are dozens for your timezone. The datetime implementation does not give pytz the opportunity to choose which one is used when constructing the datetime, so localize is needed to sort that out. Similarly, arithmetic does not always give pytz the opportunity to choose which one is used after crossing a timezone boundary, so normalize is needed to sort that out. While the results of the timeline arithmetic are unambiguous and obvious, they are arguably incorrect until normalize puts things right.

> See the "PEP-495 - Strict Invalid Time Checking" thread for more.
> There seems to be increasing "feature creep" here. Rewriting vast
> swaths of datetime internals to cater to this is at best impractical,
> especially compared to supplying a "check this datetime" function
> users who care can call when they care. Nevertheless, it's a suitable
> subject for a different PEP. I don't want to bog 495 down with it.
> If it had _stopped_ with asking for an optional check in the datetime
> constructor, it may have been implemented already ;-)

Yup. I think I'm after hooks to replace localize on construction and normalize after arithmetic, so users don't have to be relied on to do this explicitly. This doesn't need to happen now, and I fully understand this could be considered fast path and the overhead unacceptable.

>> The important bit here for pytz is that tzinfo.fromutc() may return a
>> datetime with a different tzinfo instance.
>
> Sorry, didn't follow that. Of course you can write your .fromutc() to
> return anything you want.

I can't find what concerned me any more. I think there was some wording along the lines of 'the result will be used to initialize the first flag'. What I'm reading now on fromutc() though looks fine, so I think I was mixed up.

>>>> - My argument in favour of 'is_dst' over 'first' is that this is what
>>>> we have in the data we are trying to load. You commonly have
>>>> a timestamp with a timezone abbreviation and/or offset. This can
>>>> easily be converted to an is_dst flag.
>
>>> You mean by using platform C library functions (albeit perhaps wrapped
>>> by Python)?
>
> I really missed an answer to that ;-)

I think all the data we have access to, including from platform C library functions, uses the is_dst flag or is simpler to map to the is_dst flag. The C library as exposed by the time.struct_time gives you is_dst. Mapping that to first/fold means first doing two conversions and determining which one comes first.
Similarly, when loading your JSON file or examining email headers you
need to load in a string like '2004-04-04 02:30:00 EDT-05:00'. It's
simple to use a lookup table to map the abbreviation + offset to an
is_dst flag. It's harder to map it to first/fold because they are
swapped around in April and October. And there can be more than two
transitions in a year, so if you need to support that you're going to
need to do the lookup, construct a couple of instances, and compare to
work out if EDT or EST comes first that month in that year.

But, really, I hate all the options for the flag name. I lean towards
is_dst mainly because people are used to it.

> But there's every reason to be optimistic: even someone as old and
> in-the-way as me doesn't find any of this particularly confusing ;-)

I may be old, but at least I'm not as old as Tim ;)

--
Stuart Bishop
http://www.stuartbishop.net/

From alexander.belopolsky at gmail.com Thu Aug 27 15:51:58 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 27 Aug 2015 09:51:58 -0400
Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking
In-Reply-To: <55DEFEE2.7030007@stoneleaf.us>
References: <55DCC3CE.7070200@oddbird.net> <55DCCC55.2040908@oddbird.net> <55DCD068.9020604@oddbird.net> <55DCD499.6020801@oddbird.net> <55DCD84A.5060006@oddbird.net> <55DEFEE2.7030007@stoneleaf.us>
Message-ID:

On Thu, Aug 27, 2015 at 8:13 AM, Ethan Furman wrote:

[Alexander Belopolsky]
>>
>>
>> OK, so datetime module itself will never set fold=-1.
In the list below, can you mark the methods that need to be patched to
check the fold attribute in your preferred design:

[Ethan Furman]
datetime.__new__    YES
datetime.combine    YES (could have a time instance with fold=None)
datetime.replace    YES
datetime.strptime   only if `fold=` is allowed

[Ethan Furman]
>
> If `fold=None` is present /when attempting to create a datetime/ that
> is ambiguous or invalid, an exception is raised /at that moment/
> meaning that a datetime with `fold=None` /will never exist/.
> `fold=None` /will not be the default/.

I don't grasp the significance of the slashes in your text above, but it
looks like your idea is similar to the one I outlined in the "Strict
Invalid Time Checking: an idea for another PEP" thread. As I said there,
it is workable, but there are many details that need to be thought out
before this idea becomes PEP-worthy. Let's move the discussion of those
details to a thread that does not have PEP 495 in the subject. Feel free
to reuse my thread or open a new one.

What we can do in PEP 495 is tighten the language about acceptable
values for the fold= argument in replace(). Since the pure Python
implementation currently allows None for the year through microsecond
arguments, we have the following text in the PEP:

"""
In CPython, any non-integer value of fold [passed to replace()] will
raise a TypeError, but other implementations may allow the value None to
behave the same as when fold is not given.
"""

I am fine with removing this text and leaving the fold=None option open
for future PEPs to explore.

From alexander.belopolsky at gmail.com Thu Aug 27 16:05:37 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 27 Aug 2015 10:05:37 -0400
Subject: [Datetime-SIG] PEP-431/495
In-Reply-To:
References:
Message-ID:

On Thu, Aug 27, 2015 at 9:12 AM, Stuart Bishop wrote:
>
> But, really, I hate all the options for the flag name.
I lean towards > is_dst mainly because people are used to it. Ironically, you created one of the reasons I did not want any mention of "dst" in the new attribute: the way pytz's localize() uses is_dst is so different from the way mktime() uses tm_isdst, that creating a third set of rules for a variable with a similar name was out of the question. If you have time, please re-read the answer to "Why not call the new flag 'isdst'?" [1] in the PEP and let me know if it needs any improvements. [1]: https://www.python.org/dev/peps/pep-0495/#why-not-call-the-new-flag-isdst -------------- next part -------------- An HTML attachment was scrubbed... URL: From 4kir4.1i at gmail.com Thu Aug 27 16:38:33 2015 From: 4kir4.1i at gmail.com (Akira Li) Date: Thu, 27 Aug 2015 17:38:33 +0300 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: (Stuart Bishop's message of "Thu, 27 Aug 2015 20:12:33 +0700") References: Message-ID: <8761407q52.fsf@gmail.com> Stuart Bishop writes: ... > the complexities and slowdowns of timeline arithmetic. While not > changing the behaviour of datetime at all, we could get cats and dogs > living together by just clarifying what it actually is. An *observation*: the local timezone -- the only timezone with a variable utc offset in _stdlib -- behaves like pytz_: # start with the same utc time utc_time = datetime(2015, 10, 25, 1, tzinfo=timezone.utc) stdlib_time = utc_time.astimezone() # stdlib local time pytz_time = utc_time.astimezone(tzlocal.get_localzone()) dateutil_time = utc_time.astimezone(dateutil.tz.tzlocal()) All times are consistent so far. Perform the same operations: >>> stdlib_time - timedelta(seconds=1) datetime.datetime(2015, 10, 25, 1, 59, 59, tzinfo=datetime.timezone(datetime.timedelta(0, 3600), 'CET')) >>> pytz_time - timedelta(seconds=1) datetime.datetime(2015, 10, 25, 1, 59, 59, tzinfo=) >>> dateutil_time - timedelta(seconds=1) datetime.datetime(2015, 10, 25, 1, 59, 59, tzinfo=tzlocal()) Get different results. 
The times are the same but the UTC offsets are different:

  >>> (stdlib_time - timedelta(seconds=1)).utcoffset()
  datetime.timedelta(0, 3600)
  >>> (pytz_time - timedelta(seconds=1)).utcoffset()
  datetime.timedelta(0, 3600)
  >>> (dateutil_time - timedelta(seconds=1)).utcoffset() #XXX different
  datetime.timedelta(0, 7200)

This is expected for arithmetic in the presence of DST transitions.
Here's the standard fix:

  >>> (stdlib_time - timedelta(seconds=1)).astimezone()
  datetime.datetime(2015, 10, 25, 2, 59, 59, tzinfo=datetime.timezone(datetime.timedelta(0, 7200), 'CEST'))
  >>> (pytz_time - timedelta(seconds=1)).astimezone(tzlocal.get_localzone())
  datetime.datetime(2015, 10, 25, 2, 59, 59, tzinfo=<DstTzInfo 'Europe/Paris' CEST+2:00:00 DST>)
  >>> (dateutil_time - timedelta(seconds=1)).astimezone(dateutil.tz.tzlocal()) #XXX different
  datetime.datetime(2015, 10, 25, 1, 59, 59, tzinfo=tzlocal())
  >>> datetime(2015, 10, 25, 1, 59, 59, tzinfo=dateutil.tz.tzlocal()).strftime('%Z%z')
  'CEST+0200'

Now all the timezones are the same but the times are different.

pytz recommends the normalize() method instead of astimezone() here:

  >>> tzlocal.get_localzone().normalize(pytz_time - timedelta(seconds=1))
  datetime.datetime(2015, 10, 25, 2, 59, 59, tzinfo=<DstTzInfo 'Europe/Paris' CEST+2:00:00 DST>)

The result is the same in this case.

It is a _fact_. It is how python behaves now on my platform.
To try it yourself, add this at the top: import os import time from datetime import datetime, timezone, timedelta os.environ['TZ'] = 'Europe/Paris' time.tzset() import dateutil.tz import tzlocal # get local timezone as pytz tzinfo From alexander.belopolsky at gmail.com Thu Aug 27 17:00:12 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 27 Aug 2015 11:00:12 -0400 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: <8761407q52.fsf@gmail.com> References: <8761407q52.fsf@gmail.com> Message-ID: On Thu, Aug 27, 2015 at 10:38 AM, Akira Li <4kir4.1i at gmail.com> wrote: > An *observation*: the local timezone -- the only timezone with a variable > utc offset in _stdlib -- behaves like pytz_: > > # start with the same utc time > utc_time = datetime(2015, 10, 25, 1, tzinfo=timezone.utc) > stdlib_time = utc_time.astimezone() # stdlib local time > pytz_time = utc_time.astimezone(tzlocal.get_localzone()) > I have no idea where you get tzlocal from, but I can assure you that there is no such thing in the standard library. What we do have is an option to call .astimezone() without a tzinfo argument and get an aware datetime instance with tzinfo set to a fixed offset timezone. In this respect, .astimezone() is indeed similar to pytz's localize()/normalize(), but it is much simpler. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alexander.belopolsky at gmail.com Thu Aug 27 17:07:41 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 27 Aug 2015 11:07:41 -0400 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: <8761407q52.fsf@gmail.com> References: <8761407q52.fsf@gmail.com> Message-ID: On Thu, Aug 27, 2015 at 10:38 AM, Akira Li <4kir4.1i at gmail.com> wrote: > pytz recommends normalize() method instead of astimezone() here: > > >>> tzlocal.get_localzone().normalize(pytz_time - timedelta(seconds=1)) > datetime.datetime(2015, 10, 25, 2, 59, 59, tzinfo= 'Europe/Paris' CEST+2:00:00 DST>) > > The result is the same in this case. > > It is a _fact_. It is how python behaves now on my platform. > Isn't it surprising that when you use two different libraries according to their respective specifications you get the same correct result! -------------- next part -------------- An HTML attachment was scrubbed... URL: From stuart at stuartbishop.net Thu Aug 27 18:25:38 2015 From: stuart at stuartbishop.net (Stuart Bishop) Date: Thu, 27 Aug 2015 23:25:38 +0700 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: On 27 August 2015 at 21:05, Alexander Belopolsky wrote: > > On Thu, Aug 27, 2015 at 9:12 AM, Stuart Bishop > wrote: >> >> But, really, I hate all the options for the flag name. I lean towards >> is_dst mainly because people are used to it. > > Ironically, you created one of the reasons I did not want any mention of > "dst" in the new attribute: the way pytz's localize() uses is_dst is so > different from the way mktime() uses tm_isdst, that creating a third set of > rules for a variable with a similar name was out of the question. > > If you have time, please re-read the answer to "Why not call the new flag > 'isdst'?" [1] in the PEP and let me know if it needs any improvements. 
> > [1]: > https://www.python.org/dev/peps/pep-0495/#why-not-call-the-new-flag-isdst In my experience, the first conversation would actually read: Alice: Hey, lets watch the meteor shower at 1:30am Bob: Is that 1:30am daylight savings time? Alice: Hmm... yes, the paper says daylight savings time. I think people are more aware of daylight savings time than you imagine, given the nagging by the media on how to adjust their clocks. Most of the argument against isdst also applies to first/fold. I think you are more likely to know if you want dst or not than you are to know if you want the first or second period. Most of my input comes from other computer systems, and will give me an offset. This is more easily mapped to the is_dst flag than the fold flag. If you have the dst flag, then you need details about the timezone to calculate the fold flag. In all cases, if you chose not to or are unable to specify an explicit flag you are forced to guess and will get the incorrect answer 50% of the time. While most of the time you will just guess with either approach, I think you are more likely to get disambiguation information from a computer in the form of an offset or is_dst flag than you are to get disambiguation information from a human. I have billions of rows of data in my databases. If I'm dealing with input from computer systems, such as my databases, I will have billions of conversions to do and the overhead of calculating the fold flag is billions of times worse than calculating the isdst flag from the non-bulk human input. Yes, there is a subtle difference between how tm_isdst is handled and how an is_dst flag in the datetime module should be handled, in that tm_isdst will force the use of the dst timezone or not even if it would normally not be applied to that period. Personally, I find the behaviour close enough that the similar names would be a benefit. 
Yes, the dst() method returning the dst offset and the is_dst flag have
similar names, which is fine as they represent similar concepts. The
proposed type checking would ensure that any misinterpretation was
quickly picked up, as attempting to pass timedelta(0) as the value of
is_dst would raise an exception.

So I think the argument against is_dst is weak. That said, I can't make
a strong argument in favour of it either. I still lean in favour of
is_dst myself, but I'm only leaning.

--
Stuart Bishop
http://www.stuartbishop.net/

From stuart at stuartbishop.net Thu Aug 27 18:27:26 2015
From: stuart at stuartbishop.net (Stuart Bishop)
Date: Thu, 27 Aug 2015 23:27:26 +0700
Subject: [Datetime-SIG] PEP-431/495
In-Reply-To:
References: <8761407q52.fsf@gmail.com>
Message-ID:

On 27 August 2015 at 22:00, Alexander Belopolsky wrote:

> I have no idea where you get tzlocal from, but I can assure you that there
> is no such thing in the standard library.

Consider tzlocal as part of pytz. It is Lennart's work, designed to work
with pytz and would have been part of it except I felt it better to keep
the scope narrowly defined. tzlocal.get_localzone() will load the
correct dst aware local IANA zoneinfo file wherever it is defined on the
platform, and I believe also gets you a working local dst aware tzinfo
on Windows.

--
Stuart Bishop
http://www.stuartbishop.net/

From alexander.belopolsky at gmail.com Thu Aug 27 18:34:02 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 27 Aug 2015 12:34:02 -0400
Subject: [Datetime-SIG] PEP-431/495
In-Reply-To:
References:
Message-ID:

On Thu, Aug 27, 2015 at 12:25 PM, Stuart Bishop wrote:

> I think people are more aware of daylight savings time than you
> imagine, given the nagging by the media on how to adjust their clocks.
>

I am afraid this is a very US-centric view. Try to explain the concept
to someone from a country that does not mess with the clocks.
-------------- next part -------------- An HTML attachment was scrubbed... URL: From stuart at stuartbishop.net Thu Aug 27 18:54:29 2015 From: stuart at stuartbishop.net (Stuart Bishop) Date: Thu, 27 Aug 2015 23:54:29 +0700 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: On 27 August 2015 at 23:34, Alexander Belopolsky wrote: > I am afraid this is a very US-centric view. Try to explain the concept to > someone from a country that does not mess with the clocks. People in countries that do not mess with the clocks will never be wondering about if the dst disambiguation flag should be set or not. This is about humans making a decision in the cases where it matters to them. (Currently at UTC+7 all year around, where people only experience DST changes in the form of satellite TV schedules). -- Stuart Bishop http://www.stuartbishop.net/ From tim.peters at gmail.com Thu Aug 27 18:54:38 2015 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 27 Aug 2015 11:54:38 -0500 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: [Stuart] >> I think people are more aware of daylight savings time than you >> imagine, given the nagging by the media on how to adjust their clocks. [Alexander] > I am afraid this is a very US-centric view. If I'm not mistaken, Stewart is Australian. So it's a very AUS- centric view ;-) > Try to explain the concept to someone from a country that > does not mess with the clocks. Heh. Try explaining the concept to 99.9% of humanity ;-) I live near farm country. Every year, when DST ends in late fall, there are "letters to the editor" complaining that the late crops really _need_ more sunlight - why can't the damn fool city-slicker politicians just leave the sun alone, as God created it? 
From carl at oddbird.net Thu Aug 27 19:00:41 2015 From: carl at oddbird.net (Carl Meyer) Date: Thu, 27 Aug 2015 11:00:41 -0600 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: <55DF4239.2090404@oddbird.net> On 08/27/2015 10:54 AM, Tim Peters wrote: > Heh. Try explaining the concept to 99.9% of humanity ;-) I live near > farm country. Every year, when DST ends in late fall, there are > "letters to the editor" complaining that the late crops really _need_ > more sunlight - why can't the damn fool city-slicker politicians just > leave the sun alone, as God created it? I spent some time in rural southern Mexico, where people referred to DST as "government time" and non-DST as "God's time." Most local affairs were scheduled on "God's time," even when DST was supposed to be in effect. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From alexander.belopolsky at gmail.com Thu Aug 27 19:04:09 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 27 Aug 2015 13:04:09 -0400 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: On Thu, Aug 27, 2015 at 12:25 PM, Stuart Bishop wrote: > > I have billions of rows of data in my databases. If I'm dealing with > input from computer systems, such as my databases, I will have > billions of conversions to do and the overhead of calculating the fold > flag is billions of times worse than calculating the isdst flag from > the non-bulk human input. Do you realize that a typical portable implementation of mktime calls localtime up to four times to get tm_isdst? And to get it completely "right", needs up to six calls? [1] I am sure I can do no worse than that computing the fold value. 
[1]: "BUG in mktime" https://www.sourceware.org/ml/libc-hacker/1998-10/msg00027.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Thu Aug 27 19:10:23 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 27 Aug 2015 13:10:23 -0400 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: <55DF4239.2090404@oddbird.net> References: <55DF4239.2090404@oddbird.net> Message-ID: On Thu, Aug 27, 2015 at 1:00 PM, Carl Meyer wrote: > I spent some time in rural southern Mexico, where people referred to DST > as "government time" and non-DST as "God's time." > Interesting. In Russia it was sometimes called the "decree" time even though historically the "decree time" was something different. (And guess what: the spring-forward/fall-back mnemonics don't help non-English speakers.) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Thu Aug 27 19:33:13 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 27 Aug 2015 13:33:13 -0400 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: On Thu, Aug 27, 2015 at 12:54 PM, Tim Peters wrote: > [Alexander] > > I am afraid this is a very US-centric view. > > If I'm not mistaken, Stewart is Australian. So it's a very AUS- > centric view ;-) > They are the lucky ones: "S" stands for "Summer" in their "EST", but in ours it stands for "Sommerzeitkaputt." Pop quiz: what is_dst value does CEST imply? -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Thu Aug 27 19:53:48 2015 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 27 Aug 2015 12:53:48 -0500 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: [Alexander] > Do you realize that a typical portable implementation of mktime calls > localtime up to four times to get tm_isdst? 
And to get it completely > "right", needs up to six calls? [1] I am sure I can do no worse than that > computing the fold value. Related: last night I was "amused" to see this in Gustavo's dateutil's tz.py's tzlocal implementation: def _isdst(self, dt): # We can't use mktime here. It is unstable when deciding if # the hour near to a change is DST or not. followed by examples from his native Brazilian timezone, then his own code to try to get it right. mktime is a frickin' cross-platform mess, and anyone determined to get transitions exactly right in all cases on all platforms should be scared to death of using it. Stewart, I still don't grasp what your problem is. The only concrete example I've seen is dealing with this string: 2004-10-31 01:15 EST-05:00 where you knew "EST" meant US/Eastern, but you haven't explained exactly what you're trying to _do_ with it. If you're trying to create an aware datetime out of it in a post-PEP-495 pytz, then "the obvious" way is: 1. Create a UTC datetime out of "2004-10-31 01:15" alone. 2. Create timedelta(hours=-5) out of "-05:00" alone. 3. Subtract the result of #2 from the result of #1, to convert to UTC for real. 4. Invoke .astimezone() on the result of #3, passing pytz's spelling of "US/Eastern". You (the user doing this) don't have to compute - or even know anything about - "fold" yourself. pytz's .fromutc() is the only part that has to know about "fold". The whole dance doesn't require invoking _any_ dodgy platform C library date functions. Absolutely everything known about transitions in this dance comes from pytz's wrapping of zoneinfo, so you're (Stuart) entirely in control of what happens, and all pytz users will get exactly the same result on all platforms. And none of it "should be" notably expensive. If that doesn't answer it, what - exactly - _are_ you trying to do with that string? 
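A minimal sketch of the four-step dance above, for readers who want to try it. It makes some assumptions that are not in the thread: the stdlib `zoneinfo` module (Python 3.9+, with a tz database installed) stands in for a post-PEP-495 pytz, "America/New_York" is used as the spelling of US/Eastern, and the offset is pulled out of the string by hand purely for illustration. The variable names are made up for this example.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # assumption: stand-in for a post-PEP-495 pytz zone

raw = "2004-10-31 01:15 EST-05:00"
stamp, zone_part = raw.rsplit(" ", 1)         # "2004-10-31 01:15", "EST-05:00"
sign = -1 if zone_part[-6] == "-" else 1
hours, minutes = zone_part[-5:].split(":")    # "05", "00"

# 1. Create a UTC datetime out of the wall-clock part alone.
as_utc = datetime.strptime(stamp, "%Y-%m-%d %H:%M").replace(tzinfo=timezone.utc)
# 2. Create the timedelta out of "-05:00" alone.
offset = sign * timedelta(hours=int(hours), minutes=int(minutes))
# 3. Subtract the offset, converting to UTC for real.
real_utc = as_utc - offset
# 4. Convert to the destination zone; only the zone's fromutc() deals with fold.
eastern = real_utc.astimezone(ZoneInfo("America/New_York"))

print(eastern, eastern.fold)  # 2004-10-31 01:15:00-05:00 1
```

The caller never computes fold; it falls out of the conversion from UTC, and anyone who wants only the fold bit can read `eastern.fold` and discard the rest.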
From guido at python.org Thu Aug 27 20:14:39 2015 From: guido at python.org (Guido van Rossum) Date: Thu, 27 Aug 2015 11:14:39 -0700 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: On Thu, Aug 27, 2015 at 10:53 AM, Tim Peters wrote: > [...] > Stewart, I still don't grasp what your problem is. The only concrete > example I've seen is dealing with this string: > > 2004-10-31 01:15 EST-05:00 > > where you knew "EST" meant US/Eastern, but you haven't explained > exactly what you're trying to _do_ with it. If you're trying to > create an aware datetime out of it in a post-PEP-495 pytz, then "the > obvious" way is: > > 1. Create a UTC datetime out of "2004-10-31 01:15" alone. > 2. Create timedelta(hours=-5) out of "-05:00" alone. > 3. Subtract the result of #2 from the result of #1, to convert to UTC for > real. > 4. Invoke .astimezone() on the result of #3, passing pytz's spelling > of "US/Eastern". Funny, I would have done it this way (but the outcome should be the same): 1. Create a naive datetime out of "2004-10-31 01:15" alone. 2. Create timedelta(hours=-5) out of "-05:00" alone. 3. Subtract the result of #2 from the result of #1 to get the UTC time as a naive datetime. 4. Use .replace(tzinfo=datetime.timezone.utc) to mark the result as UTC. 5. Invoke .astimezone() on the result of #3, passing pytz's spelling of "US/Eastern". Only #5 requires pytz (though you could use pytz.utc in #4 -- it doesn't matter). My reason for the extra step is philosophical: the datetime created in the first step is just a (date, time) combo that *doesn't know* its timezone yet. Only after step 3 do we have the designated point in time, so then we can mark it as UTC. Of course, since the result is the same, if Tim's version is faster that's a fine way to do it -- but IMO mine makes it clearer what's going on. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tim.peters at gmail.com Thu Aug 27 20:32:34 2015 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 27 Aug 2015 13:32:34 -0500 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: [Tim] >> [...] >> Stewart, I still don't grasp what your problem is. The only concrete >> example I've seen is dealing with this string: >> >> 2004-10-31 01:15 EST-05:00 >> >> where you knew "EST" meant US/Eastern, but you haven't explained >> exactly what you're trying to _do_ with it. If you're trying to >> create an aware datetime out of it in a post-PEP-495 pytz, then "the >> obvious" way is: >> >> 1. Create a UTC datetime out of "2004-10-31 01:15" alone. >> 2. Create timedelta(hours=-5) out of "-05:00" alone. >> 3. Subtract the result of #2 from the result of #1, to convert to UTC for >> real. >> 4. Invoke .astimezone() on the result of #3, passing pytz's spelling >> of "US/Eastern". [Guido] > Funny, I would have done it this way (but the outcome should be the same): > > 1. Create a naive datetime out of "2004-10-31 01:15" alone. > 2. Create timedelta(hours=-5) out of "-05:00" alone. > 3. Subtract the result of #2 from the result of #1 to get the UTC time as a > naive datetime. > 4. Use .replace(tzinfo=datetime.timezone.utc) to mark the result as UTC. > 5. Invoke .astimezone() on the result of #3, passing pytz's spelling > of "US/Eastern". > > Only #5 requires pytz (though you could use pytz.utc in #4 -- it doesn't > matter). > > My reason for the extra step is philosophical: the datetime created in the > first step is just a (date, time) combo that *doesn't know* its timezone > yet. Only after step 3 do we have the designated point in time, so then we > can mark it as UTC. > > Of course, since the result is the same, if Tim's version is faster that's a > fine way to do it -- but IMO mine makes it clearer what's going on. 
That's fine - I'm trying to get across the idea, not suggest a specific implementation, and 4 steps are one less than 5 ;-) An actual speed-conscious implementation would create the datetime with the US/Eastern timezone in my step #1 already. Classic arithmetic still gets the right result in #3 (timedelta subtraction ignores the tzinfo, except just to copy it into the result). The _point_ is that then step #4 can call the cheaper .fromutc() instead (which requires that the datetime invoking it already has the destination tzinfo). You could do the same in yours by changing your #4 to attach US/Eastern, and changing #5 to just invoke .fromutc(). It's a little cheaper to attach the destination zone in #1, though (in all, requires creating one fewer datetime object). From alexander.belopolsky at gmail.com Thu Aug 27 20:59:02 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 27 Aug 2015 14:59:02 -0400 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: On Thu, Aug 27, 2015 at 2:32 PM, Tim Peters wrote: > [Tim] > >> [...] > >> Stewart, I still don't grasp what your problem is. The only concrete > >> example I've seen is dealing with this string: > >> > >> 2004-10-31 01:15 EST-05:00 > .. > [Tim] > > That's fine - I'm trying to get across the idea, not suggest a > specific implementation, and 4 steps are one less than 5 ;-) > Dealing with simple time strings like this should really be one step: >>> ts = '2004-10-31 01:15 EST-05:00' >>> dt = datetime.strptime(ts.replace('EST', '').replace(':', ''), '%Y-%m-%d %H%M %z') >>> print(dt) 2004-10-31 01:15:00-05:00 It is unfortunate that we need to massage the input like this before passing it to datetime.strptime(). Ideally, datetime.strptime() should have a way to at least parse -05:00 as a fixed offset timezone. However, this problem has nothing to do with PEP 495. Once you have the UTC offset - there is no ambiguity. The other 01:15 would be "2004-10-31 01:15 -04:00." 
The EST part is redundant and is dropped explicitly in my solution and not used in the solutions by Tim and Guido. -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at oddbird.net Thu Aug 27 21:10:45 2015 From: carl at oddbird.net (Carl Meyer) Date: Thu, 27 Aug 2015 13:10:45 -0600 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: <55DF60B5.6080703@oddbird.net> On 08/27/2015 12:59 PM, Alexander Belopolsky wrote: > Dealing with simple time strings like this should really be one step: > >>>> ts = '2004-10-31 01:15 EST-05:00' >>>> dt = datetime.strptime(ts.replace('EST', '').replace(':', ''), > '%Y-%m-%d %H%M %z') >>>> print(dt) > 2004-10-31 01:15:00-05:00 > > It is unfortunate that we need to massage the input like this before > passing it to datetime.strptime(). Ideally, datetime.strptime() should > have a way to at least parse -05:00 as a fixed offset timezone. > > However, this problem has nothing to do with PEP 495. Once you have the > UTC offset - there is no ambiguity. The other 01:15 would be > "2004-10-31 01:15 -04:00." The EST part is redundant and is dropped > explicitly in my solution and not used in the solutions by Tim and Guido. I don't think that's true; it's not entirely ignored in Tim and Guido's solution, and your solution gives subtly different results. A datetime with a fixed-offset -0500 tzinfo and a datetime with a tzinfo that knows it is "US/Eastern" may represent the same instant, but they are semantically different. That difference could reveal itself after some arithmetic with the datetime (because in one case the tzinfo's utcoffset might change after the arithmetic, and in the other case it wouldn't). (Of course if it's a pytz timezone the offset wouldn't change until after a `normalize()`, but that's not necessarily true of any tzinfo implementation.) Guido and Tim don't provide code to parse "EST-05:00" to "US/Eastern", but their solutions both rely on assuming that knowledge. 
Tim said so explicitly: 'where you knew "EST" meant US/Eastern' Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: OpenPGP digital signature URL: From alexander.belopolsky at gmail.com Thu Aug 27 21:17:57 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 27 Aug 2015 15:17:57 -0400 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: <55DF60B5.6080703@oddbird.net> References: <55DF60B5.6080703@oddbird.net> Message-ID: On Thu, Aug 27, 2015 at 3:10 PM, Carl Meyer wrote: > Guido and Tim don't provide code to parse "EST-05:00" to "US/Eastern", > but their solutions both rely on assuming that knowledge. Tim said so > explicitly: 'where you knew "EST" meant US/Eastern' > OK. Add Tim's step #4 or Guido's step #5 to my solution. Since those are the same, I did not feel compelled to show the same step for the third time. -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Thu Aug 27 21:21:39 2015 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 27 Aug 2015 14:21:39 -0500 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: [Alexander Belopolsky ] > Dealing with simple time strings like this should really be one step: > > >>> ts = '2004-10-31 01:15 EST-05:00' > >>> dt = datetime.strptime(ts.replace('EST', '').replace(':', ''), '%Y-%m-%d > >>> %H%M %z') > >>> print(dt) > 2004-10-31 01:15:00-05:00 > > It is unfortunate that we need to massage the input like this before passing > it to datetime.strptime(). Ideally, datetime.strptime() should have a way > to at least parse -05:00 as a fixed offset timezone. > > However, this problem has nothing to do with PEP 495. Once you have the UTC > offset - there is no ambiguity. The other 01:15 would be "2004-10-31 01:15 > -04:00." 
> The EST part is redundant and is dropped explicitly in my solution
> and not used in the solutions by Tim and Guido.

You're solving a different problem, though. I keep guessing at what
Stuart really wants in the end, but my _best_ guess is that he wants an
aware datetime in US/Eastern, not an aware datetime in some anonymous
eternally-fixed-offset-of-minus-5-hours timezone. If he really wanted a
fixed-offset zone, then I can't imagine why he would pick a string so
unlikely as to just happen to show an ambiguous time in US/Eastern.

The relation to PEP 495 is in the repeated claims that somehow or other
it's easy to compute is_dst from the original string but very difficult
(or something) to compute "fold". The points of the last few messages
were:

(1) mktime is neither a cheap nor a reliable (in all cases) way to
    compute is_dst; and,

(2) a user doesn't even have to know about "fold" to get a US/Eastern
    datetime in pytz from that string; and,

(3) it's not expensive and is wholly reliable (in all cases) to compute
    "fold" in a post-PEP-495 pytz if for some unfathomable (to me ;-) )
    reason Stuart really _does_ want to compute fold for its own sake
    from that string (do the dance I gave, then throw away everything
    except the fold bit).

And I agree that extending .strptime() is irrelevant to the PEP ;-)

From alexander.belopolsky at gmail.com Thu Aug 27 21:37:44 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 27 Aug 2015 15:37:44 -0400
Subject: [Datetime-SIG] PEP-431/495
In-Reply-To:
References:
Message-ID:

On Thu, Aug 27, 2015 at 3:21 PM, Tim Peters wrote:

> (1) mktime is neither a cheap nor a reliable (in all cases) way to
> compute is_dst; and,
>

Moreover, there are cases where it is not even possible to tell whether
DST is in effect or not.
For example, when Moscow switched from permanent summer time to permanent winter time last year and Russia revised its entire map of 7 time zones they did not tell anyone what the new value of tm_isdst should be.

Take my favorite example: Kiev, Ukraine, in 1990.

Running on Linux:

$ zdump -v -c 1992 Europe/Kiev| grep 1990
Europe/Kiev Sat Jun 30 22:59:59 1990 UTC = Sun Jul 1 01:59:59 1990 MSK isdst=0 gmtoff=10800
Europe/Kiev Sat Jun 30 23:00:00 1990 UTC = Sun Jul 1 01:00:00 1990 EET isdst=0 gmtoff=7200

Running on a Mac:

$ zdump -v Europe/Kiev| grep 1990
Europe/Kiev Sat Mar 24 22:59:59 1990 UTC = Sun Mar 25 01:59:59 1990 MSK isdst=0
Europe/Kiev Sat Mar 24 23:00:00 1990 UTC = Sun Mar 25 03:00:00 1990 MSD isdst=1
Europe/Kiev Sat Jun 30 21:59:59 1990 UTC = Sun Jul 1 01:59:59 1990 MSD isdst=1
Europe/Kiev Sat Jun 30 22:00:00 1990 UTC = Sun Jul 1 01:00:00 1990 EEST isdst=1

From either of the outputs above, can you tell me what color the bear was?

From 4kir4.1i at gmail.com Thu Aug 27 21:57:50 2015
From: 4kir4.1i at gmail.com (Akira Li)
Date: Thu, 27 Aug 2015 22:57:50 +0300
Subject: [Datetime-SIG] PEP-431/495
In-Reply-To: <55DF60B5.6080703@oddbird.net> (Carl Meyer's message of "Thu, 27 Aug 2015 13:10:45 -0600")
References: <55DF60B5.6080703@oddbird.net>
Message-ID: <87vbc05wsh.fsf@gmail.com>

Carl Meyer writes:

> On 08/27/2015 12:59 PM, Alexander Belopolsky wrote:
>> Dealing with simple time strings like this should really be one step:
>>
>>>>> ts = '2004-10-31 01:15 EST-05:00'
>>>>> dt = datetime.strptime(ts.replace('EST', '').replace(':', ''),
>> '%Y-%m-%d %H%M %z')
>>>>> print(dt)
>> 2004-10-31 01:15:00-05:00
>>
>> It is unfortunate that we need to massage the input like this before
>> passing it to datetime.strptime(). Ideally, datetime.strptime() should
>> have a way to at least parse -05:00 as a fixed offset timezone.
>>
>> However, this problem has nothing to do with PEP 495.
Once you have the >> UTC offset - there is no ambiguity. The other 01:15 would be >> "2004-10-31 01:15 -04:00." The EST part is redundant and is dropped >> explicitly in my solution and not used in the solutions by Tim and Guido. > > I don't think that's true; it's not entirely ignored in Tim and Guido's > solution, and your solution gives subtly different results. A datetime > with a fixed-offset -0500 tzinfo and a datetime with a tzinfo that knows > it is "US/Eastern" may represent the same instant, but they are > semantically different. That difference could reveal itself after some > arithmetic with the datetime (because in one case the tzinfo's utcoffset > might change after the arithmetic, and in the other case it wouldn't). > (Of course if it's a pytz timezone the offset wouldn't change until > after a `normalize()`, but that's not necessarily true of any tzinfo > implementation.) > > Guido and Tim don't provide code to parse "EST-05:00" to "US/Eastern", > but their solutions both rely on assuming that knowledge. Tim said so > explicitly: 'where you knew "EST" meant US/Eastern' EST was ambiguous in the past [1]. But now the pytz code from the answer returns a single utc offset and many timezones that use EST abbreviation. Obviously, even if the corresponding regions have the same utc now; it doesn't mean that they use the same rules e.g., DST transitions may occur at different time in different timezones that have the same utc offset now. Even if the timezones use the same rules now; they might have used different rules in the past and they may use different rules in the future. Europe is a good example. Even if you know the zone id such as Europe/Moscow; different tzdata versions may provide different rules for the same region i.e., if you convert the time to UTC for storage then it is not enough to save the zone id, to restore the same local time in the future. 
The timezone abbreviation may be ambiguous by itself e.g., if tzname='CST' then the code [1] produces 3 different utc offsets for the same time (the corresponding representative timezones are America/Havana, US/Central, Asia/Shanghai). [1] http://stackoverflow.com/questions/13707545/linux-convert-timefor-different-timezones-to-utc From alexander.belopolsky at gmail.com Thu Aug 27 22:07:23 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Thu, 27 Aug 2015 16:07:23 -0400 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: <87vbc05wsh.fsf@gmail.com> References: <55DF60B5.6080703@oddbird.net> <87vbc05wsh.fsf@gmail.com> Message-ID: On Thu, Aug 27, 2015 at 3:57 PM, Akira Li <4kir4.1i at gmail.com> wrote: > EST was ambiguous in the past [1]. But now the pytz code from the answer > returns a single utc offset and many timezones that use EST > abbreviation. > Look, even though Einstein's theory of relativity tells us that time and space are the same, PEP 495 is only trying to disambiguate times, not places. You will not be able to use the fold value to differentiate between Moscow and Nizhny Novgorod. Any support for that will have to be covered in another PEP. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ethan at stoneleaf.us Fri Aug 28 03:17:06 2015 From: ethan at stoneleaf.us (Ethan Furman) Date: Thu, 27 Aug 2015 18:17:06 -0700 Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking In-Reply-To: References: <55DCC3CE.7070200@oddbird.net> <55DCCC55.2040908@oddbird.net> <55DCD068.9020604@oddbird.net> <55DCD499.6020801@oddbird.net> <55DCD84A.5060006@oddbird.net> <55DEFEE2.7030007@stoneleaf.us> Message-ID: <55DFB692.4040101@stoneleaf.us> On 08/27/2015 06:51 AM, Alexander Belopolsky wrote: > """ > In CPython, any non-integer value of fold [passed to replace()] will raise a TypeError , but other implementations may allow the value None to behave the same as when fold is not given. 
> """ > > I am fine with removing this text and leaving fold=None option open for the future PEPs to explore. Sounds like a good compromise. Thank you. -- ~Ethan~ From guido at python.org Fri Aug 28 04:09:13 2015 From: guido at python.org (Guido van Rossum) Date: Thu, 27 Aug 2015 19:09:13 -0700 Subject: [Datetime-SIG] PEP-495 - Strict Invalid Time Checking In-Reply-To: <55DFB692.4040101@stoneleaf.us> References: <55DCC3CE.7070200@oddbird.net> <55DCCC55.2040908@oddbird.net> <55DCD068.9020604@oddbird.net> <55DCD499.6020801@oddbird.net> <55DCD84A.5060006@oddbird.net> <55DEFEE2.7030007@stoneleaf.us> <55DFB692.4040101@stoneleaf.us> Message-ID: Honestly, rather than weasel-wording the PEP to keep the option open to assign a different meaning to fold=None in the future, whatever semantics people would like should just be given a new keyword or a new method. On Thu, Aug 27, 2015 at 6:17 PM, Ethan Furman wrote: > On 08/27/2015 06:51 AM, Alexander Belopolsky wrote: > > """ >> In CPython, any non-integer value of fold [passed to replace()] will >> raise a TypeError , but other implementations may allow the value None to >> behave the same as when fold is not given. >> """ >> >> I am fine with removing this text and leaving fold=None option open for >> the future PEPs to explore. >> > > Sounds like a good compromise. Thank you. > > > -- > ~Ethan~ > _______________________________________________ > Datetime-SIG mailing list > Datetime-SIG at python.org > https://mail.python.org/mailman/listinfo/datetime-sig > The PSF Code of Conduct applies to this mailing list: > https://www.python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From tim.peters at gmail.com Fri Aug 28 04:22:01 2015 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 27 Aug 2015 21:22:01 -0500 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: <87zj1d7usy.fsf@gmail.com> Message-ID: [Stuart Bishop ] > The fixed-offset classes and sorting arithmetic were the only way to > get things round tripping. Arithmetic has nothing to do with round tripping. I can't explain that any better than I already have. Therefore it must already be clear. Therefore I'll go on to pretend it isn't ;-) > Take a datetime. Convert it to another timezone. That _alone_ can't always work right. I gave you a specific, detailed example before. More generally. pick _any_ case where a UTC time `u` corresponds to an ambiguous time in some other timezone T. Convert `u` to time `t` in T and back to UTC again. The chance of getting `u` back again is in one in two. Without a disambiguation bit, it's impossible for the conversion back to UTC to know which UTC time this dance started with. That's what "ambiguous time" _means_: there's more one UTC spelling of `t`. Without a bit to specify _which_ UTC time `t` was derived from, it's impossible to do better than just guess which UTC time was intended. > Add one hour to both. But that's off in the weeds. If you use a form of arithmetic inappropriate for the problem you're trying to solve, _of course_ that's going to cause problems. Use, e.g., integer arithmetic for a problem that requires floating arithmetic, and nothing good can come of it. Likewise, in many cases, using floating arithmetic for a problem that requires integer arithmetic. Same thing using classic arithmetic for a problem that requires timeline arithmetic. That's "pilot error". > Compare. The results were inconsistent. Conversion alone can cause problems. Using inappropriate arithmetic is a distinct source of "garbage in, garbage out". 
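The round-trip failure described above can be sketched with a toy model. This is only an illustration under stated assumptions: it hard-codes the single US/Eastern fall-back transition of 2004-10-31 (06:00 UTC), the helper names `utc_to_wall` and `wall_to_utc` are hypothetical, and `fold` carries the meaning PEP 495 proposes (1 for the later of two ambiguous times).

```python
from datetime import datetime, timedelta

# Assumption: a toy model of one transition only (US/Eastern, autumn 2004).
EDT = timedelta(hours=-4)
EST = timedelta(hours=-5)
SWITCH_UTC = datetime(2004, 10, 31, 6, 0)   # naive datetime read as UTC
FOLD_START = SWITCH_UTC + EST               # 01:00 on the wall clock
FOLD_END = SWITCH_UTC + EDT                 # 02:00 on the wall clock

def utc_to_wall(u):
    """Map a naive UTC datetime to (wall-clock reading, fold)."""
    if u < SWITCH_UTC:
        return u + EDT, 0
    wall = u + EST
    # Readings in the repeated hour after the switch are the later ones.
    return wall, int(wall < FOLD_END)

def wall_to_utc(wall, fold=0):
    """Map a wall-clock reading back to UTC; fold picks one of two answers."""
    if wall < FOLD_START:
        offset = EDT
    elif wall >= FOLD_END:
        offset = EST
    else:
        offset = EST if fold else EDT
    return wall - offset

u_early = datetime(2004, 10, 31, 5, 15)
u_late = datetime(2004, 10, 31, 6, 15)
wall_early, fold_early = utc_to_wall(u_early)
wall_late, fold_late = utc_to_wall(u_late)

print(wall_early == wall_late)                      # True: both read 01:15
print(wall_to_utc(wall_late) == u_late)             # False: guessing fold=0 loses u_late
print(wall_to_utc(wall_late, fold_late) == u_late)  # True: the fold bit recovers it
```

Without the extra bit, the conversion back to UTC has to guess, and the guess is right for only one of the two UTC times that share the same wall-clock reading.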
> You would only get correct results with fixed offset timezones, If appropriate (timeline) arithmetic had been used instead to "add an hour", then only the errors due to conversion endcases would have remained. > because the builtin arithmetic ignored the timezone, > because there was no is_dst flag and without it it is impossible to > get correct results. is_dst is necessary and sufficient to repair the conversion errors. Sorry, but all other errors were self-inflicted (using inappropriate arithmetic). Granted, the Python docs never did scream about this. As Guido said in an earlier message, we pretty much just assumed people who wanted timeline arithmetic would use UTC, or plain old timestamps, instead. And they still should. It's easy to write 1-line Python functions to implement timeline arithmetic (modulo that errors due to conversion alone still remain), but that's a grossly inefficient way to avoid best practice too. > The burden was left on tzinfo implementations to deal with the problem. There's more than one problem here. > You could have naive times and do arithmetic correctly, At least that part got communicated ;-) > or you could have zone aware times and do conversions correctly, No. Not in all cases. See the start of this msg. > but to do both developers had to always convert to and from > utc to do the arithmetic. Using timeline arithmetic removes all errors _due_ to using inappropriate arithmetic, but is of no help for the errors due to conversion alone. PEP 495 aims to fix the latter. There is no "arithmetic problem" beyond programmers needlessly shooting kittens in their cute, furry heads. > And developers being lazy creatures wouldn't bother because it > would normally work, or even always work in their particular > timezone, and systems would crash at 4am killing innocent > kittens. This isn't unique to datetime code. Programmers who don't learn and adopt best practices are responsible for a great deal of damage in the real world. 
No programming language can stop that (although some academic ones have made heroic efforts). > And this was a problem with my tzinfo implementation, because > the only way you could possibly experience the problem was by using my > tzinfo implementation. Python had avoided this clearly documented > problem by not supplying any tzinfo implementations, even though it > would have been easy to create a 'local' one using the information > already exposed in the time module, and I'd always assumed that fixing > it was a requirement of adding timezone implementations to the > standard library. So I fixed it. I do admire the hack! But the magic it's trying to perform strikes me as more of an "attractive nuisance" than a real aid to writing correct code. If users needing timeline arithmetic _did_ bite the bullet and work in UTC internally, they would first find out it's a very small & squishy bullet to bite, easy to swallow and digest. That's ain't no Unicode nightmare. A simple .astimezone() on input & output and they're golden. In return they'd enjoy cleaner, more maintainable, shorter, and more likely correct code. It would run faster too. As is, what if they forget a .normalize()? Try to use datetimes obtained from other packages? Try to pass pytz datetimes _to_ other packages? Forget to check after a .replace() or .combine() or ...? Of course you can't possibly prevent programmers from slaughtering kittens either. But in return for enabling a lazy programmer to avoid using UTC, that programmer has to litter their code with .localize() and.normalize() calls. pytz did a real service for those who couldn't afford _any_ errors in conversions alone, and by wrapping the Olson database, but I don't think it does any _real_ favors by making slow & complex timeline arithmetic more attractive to the terminally lazy. 
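As a rough illustration of the "convert to UTC, do the arithmetic, convert back" practice recommended here, the sketch below contrasts classic and timeline addition. Assumptions: `ToyEastern` is a hypothetical tzinfo that models only the 2004-10-31 US/Eastern fall-back, `timeline_add` is a made-up helper name, and the code needs a Python whose datetime implements PEP 495's `fold` attribute (3.6 or later).

```python
from datetime import datetime, timedelta, timezone, tzinfo

EDT = timedelta(hours=-4)
EST = timedelta(hours=-5)
SWITCH_UTC = datetime(2004, 10, 31, 6, 0)   # fall-back instant, naive UTC
FOLD_START = SWITCH_UTC + EST               # wall clock 01:00
FOLD_END = SWITCH_UTC + EDT                 # wall clock 02:00

class ToyEastern(tzinfo):
    """Hypothetical zone: EDT before SWITCH_UTC, EST from then on."""

    def utcoffset(self, dt):
        wall = dt.replace(tzinfo=None)
        if wall < FOLD_START:
            return EDT
        if wall >= FOLD_END:
            return EST
        return EST if dt.fold else EDT   # ambiguous hour: trust fold

    def dst(self, dt):
        return self.utcoffset(dt) - EST

    def tzname(self, dt):
        return "EDT" if self.utcoffset(dt) == EDT else "EST"

    def fromutc(self, dt):
        # dt carries UTC field values with tzinfo=self attached.
        u = dt.replace(tzinfo=None)
        if u < SWITCH_UTC:
            return dt + EDT
        wall = dt + EST
        return wall.replace(fold=int(wall.replace(tzinfo=None) < FOLD_END))

def timeline_add(dt, delta):
    """Add elapsed time: go to UTC, add, come back to dt's zone."""
    return (dt.astimezone(timezone.utc) + delta).astimezone(dt.tzinfo)

eastern = ToyEastern()
start = datetime(2004, 10, 30, 12, 0, tzinfo=eastern)
day = timedelta(hours=24)

classic = start + day                # same wall-clock time next day
timeline = timeline_add(start, day)  # exactly 24 elapsed hours later

print(classic.replace(tzinfo=None))   # 2004-10-31 12:00:00
print(timeline.replace(tzinfo=None))  # 2004-10-31 11:00:00
```

Neither result is wrong; they answer different questions. The helper only makes the choice explicit at the call site instead of hiding it inside the tzinfo.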
> Drunk on my own cleverness and relative youth, it never occurred > to me that it was possible to rationalize the existing behaviour > with a straight face, where after going to all the effort of constructing > and adding a tzinfo to your datetime it would sit there entirely ignored > by Python, except for conversion operations, If you think _you're_ drunk on your own cleverness, try working for Guido ;-) In any case, catering to timeline arithmetic was not a use case for datetime's design. Conversions were. Believe it or not, across datetime's extensive public design phase, timeline arithmetic barely came up. > consistently giving you answers that are demonstrably incorrect using > most modern timekeeping systems. I'm still not capable of conjuring > up such a monumental rationalization ;) That's why I repeat mine so often. Pretty soon you'll be able to just cut & paste pieces of mine to create trillions of rationalizations that all sound remarkably similar yet are provably distinct ;-) ... >>> As far as I know, normalize() is not necessary after astimezone() even >>> now >>> https://answers.launchpad.net/pytz/+question/249229 > Yeah, I'm putting off answering that one because I'm not sure if I'll > get the answer right. People sometimes think I actually know what I'm > doing. I'll have a look after I get the overdue pytz release out. My guess is it all depends on what your .fromutc() does, since that's the last step .astimezone() performs. That is, if your .fromutc() attaches "the right" tzinfo, then .astimezone() inherits that goodness. From tim.peters at gmail.com Fri Aug 28 05:18:20 2015 From: tim.peters at gmail.com (Tim Peters) Date: Thu, 27 Aug 2015 22:18:20 -0500 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: <87fv356k5i.fsf@gmail.com> References: <87zj1d7usy.fsf@gmail.com> <87fv356k5i.fsf@gmail.com> Message-ID: [Akira Li <4kir4.1i at gmail.com>] ... > I agree on the best practices here. 
I would prefer that __add__ would be > forbidden for local timezones unless they have a fixed utc offset. But > it might be too late for that now. It's many years too late to change anything about datetime arithmetic in any backward-incompatible way. > If __add__ is allowed for timezone-aware datetime objects It already is. > then arithmetic "as though via conversion to utc time" is *equally valid* as > the arithmetic "as though it is a timezone-naive datetime object".aa __add__ can only mean one of them. And it already does. It _could_ have meant the other, but it doesn't. ... >> Apps that care about leap seconds _should_ be using TAI. Apps that >> want timeline arithmetic _should_ be using UTC. Unfortunately, people >> shoot themselves in the feet all the time. Python can't stop that. >> But it doesn't have to _cater_ to poor practices either. > By your logic: Apps that care about timezone-naive arithmetic _should_ > be using naive datetime objects. As I've said several times before, that's indeed what I would have _preferred_. But it was only a mild preference. Since timeline arithmetic is far better done in UTC anyway, I'm "happy enough" with aware datetimes using classic arithmetic. Indeed, my own code uses that frequently, and so does datetime's implementation. Several examples of that have been given in other messages. > I agree it is a poor practice to perform arithmetic on localized time. > But as long as such arithmetic *is* allowed then it *is* ambiguous what > type of arithmetic should be used. There is no *one obvious* way here. Sure. But this isn't a case of "in the face of ambiguity refuse the temptation to guess". It's a case of "in the face of ambiguity, pick one, document the choice, and move on". Similarly, when printing a floating point number, there are _many_ equally valid choices for how many digits to display after the decimal point. That's no argument for refusing to print floats. Python has made different choices about that over time. 
If you don't like Python's default choice, with enough extra work you can force any number of digits you like. And if you don't like classic arithmetic, with enough extra work you can get any other kind of datetime arithmetic you like. >>> ... >>> dateutil doesn't work during DST transitions but PEP 495 might allow to >>> fix it. ... > I've linked to a couple of dateutil bugs previously in PEP-431/495 > thread [1] > > I was surprised as you that dateutil .fromutc() appears to be broken. > > I use "might" because I haven't read dateutil code. I can't be sure > e.g., what backward-compatibility concerns might prevent PEP 495 fix its > issues with an ambigous local time. Sorry, I haven't been (and won't be) making time to stare at bugs in other packages. I probably know less about them than you do anyway. Python has always had tests for proper DST transition UTC->local conversions for the kind of tzinfo classes the docs suggest writing. But do note that "proper" in this context _only_ means "mimics the local clock" (gets the repeated or missing YYYY-MM-DD HH:MM:SS behaviors right). Without a disambiguation flag, it's flatly impossible to always get the zone name and/or UTC offset right for ambiguous local-clock hours. > Timezones is a very complicated > topic -- no solution works in the general case.a ? Timezone transitions are mathematically trivial. They're just full of lumps (irregularities). > ... > The only my objection that timezone-naive arithmetic is somehow superior > for localized times. It's not only superior, it's essential for some purposes. For other purposes, it''s worse than useless. > Though I don't mind it as long as timezone conversions would work. Good! That's precisely what PEP 495 intends to make possible. > ... >>> If dateutil can be fixed to work correctly using the disambiguation flag >>> then its behavior is preferable because it eliminates localize, >>> normalize calls > ... > The key word here is "If". *If* it works; great. 
I can't speak for dateutil or its author. But we're not doing brain surgery here. Conversion just isn't a deep problem. The only non-trivial cases involve ambiguous times. C fixed that long before Python existed with is_dst, although granted that mktime() is notoriously flaky across platforms in edge cases. > It is still possible to perform both types of arithmetic as the examples > above demonstrate. That's always been possible - except for errors in timeline arithmetic inherited from the rare failing conversion cases. 495 allows to repair all failures. > ... [more about pytz and dateutil] .;.. Please don't take offense at my chopping this. It's only that I need to make _some_ time tonight for 495 ;-) From tim.peters at gmail.com Fri Aug 28 08:01:06 2015 From: tim.peters at gmail.com (Tim Peters) Date: Fri, 28 Aug 2015 01:01:06 -0500 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: References: Message-ID: [Stuart Bishop ] > ... [on timeline arithmetic] ... > I'm wondering if it is worth formalizing this (post-PEP-495,or maybe > some choice wording changes made in the docs). Would it work if we > introduced a new type, datetimetz? We would have a time, with a tzinfo > because it might be useful later, a naive time, with a tzinfo because > it is useful for rendering and conversions, and a datetimetz with all > the complexities and slowdowns of timeline arithmetic. While not > changing the behaviour of datetime at all, we could get cats and dogs > living together by just clarifying what it actually is. There was a lot of discussion of this before you arrived here, and even a PEP (500). At least Guido, Alex and I agreed it would be better for the tzinfo object to decide which kind of arithmetic to use. For example, if you're right that billions (nay, trillions!) 
of programmers will eventually suffer irreparable emotional harm from learning how classic arithmetic works, they'll want to convert their code immediately, before their innocent children suffer clinical depression too. Because datetimes are typically created all over the place, but programs typically have only a few places where a tzinfo is obtained from some factory functions, it should be much easier to just change the latter call sites. So, e.g., get one tzinfo that says "timeline arithmetic!" in some way, and _all_ datetimes using it obey God's Way To Do It. The first question then is "how does a tzinfo spell that?". PEP 500 proposed adding optional new magic methods to tzinfos, so they could implement whatever damn fool arithmetic they liked. datetime internals would only change to see whether a tzinfo supplied such-&-such a method, and delegate arithmetic to it if so. 1. For timeline arithmetic, a tzinfo subclass could supply methods for the 3 kinds of arithmetic (datetime - datetime,, and datetime +/- timedelta), with bodies akin to the simple one-liner I showed before for datetime + timedelta. 2. People who wanted leap seconds (to account for real-world durations between two civil times) could similarly supply _that_, via even slower arithmetic. 3. And, e.g., people who wanted to view timedeltas as representing durations in Mars seconds could convert to Earth seconds under the covers. That's Alex's primary use case. So, quite general, and little impact on the core. Guido rejected it ;-) The other idea was building timeline arithmetic into the core datetime implementation, and use it if and only if a tzinfo had a magic new attribute, or inherited from a magic new marker class. Not generalizable beyond _just_ that case, heavier impact on the core, and so far nobody has cared enough to write a PEP. The second question is whether _anything_ should be done in this direction. I was +0.83 on PEP 500 at first, but -0.51 on anything now. 
Alex can move to Mars if he loves Mars time so much, while I don't really want Python to enable poor practice in the #1 and #2 cases. UTC is perfectly adequate for those who need timeline arithmetic, and that was the _intent_ from the start (although I don't recall the docs saying so) - and using UTC for this purpose is also universally recognized as best practice. If someone is determined to be foolish, fine, let 'em use an explicit function. > ... > If our underlying platforms that we needed to work with supported it, > I'd probably be in favour of leap seconds. I doubt that would ever > happen - there are more palatable workarounds. People who need it really need it - but they should be working in TAI. In Python, if they work in UTC - or even in naive datetime - it's quite possible to write leap-second-aware functions to do what they want. Intriguingly, TAI is nearly identical to Python's "naive time". So stick that in your pipe and smoke it: the people responsible for building the most sophisticated clocks on Earth _live_ in naive time. It's the most sophisticated notion of time yet known ;-) OTOH, for people who don't need it, accounting for leap seconds would be a mistake: best I can tell, every programming language on the planet with any kind of date-and-time support follows the POSIX-approximation-to-UTC model now. So if your arithmetic accounts for leap seconds, it won't agree with anyone else's in the computer world. > ... > I think in my view, as soon as you go to the bother of adding a tzinfo > instance to the datetime you are making a statement about the expected > behaviour; that the simpler classic arithmetic no longer applies and > the more complex model needs to be used. I had already guessed that ;-) It's just a dozen years too late to influence datetime's design. >> ... >> There you go: "timeline" datetime + timedelta arithmetic about as >> efficiently as possible in pure Python. > ... 
> What I don't like about this approach is the developers need to be > aware that they need to call it, Is that really worse than needing to call .normalize() after every arithmetic operation, with - I bet - most not being really clear on _why_ they need to? > and that dt + timedelta(hours=24) may not work. Adding functions for timeline arithmetic can't possibly change what classic arithmetic does. For me, adding timedelta(hours=24) always does exactly what I intend it to do. But, yes, people will forget the distinction sometimes. But easy solution: do what they _should_ have done from the start: work in UTC instead, and have no problems, surprises, missing magical invocations, or confusions of any kind ever. > Of course, developers will not be aware or have done more > than skim the docs until after their guests have all died of > salmonella poisoning from the undercooked Turkey. Not a problem. My turkey party occurs at the _end_ of DST. "Same time next day" would keep the turkey in the smoker for 25 hours, not 23. No salmonella: you're obviously determined to spread groundless turkey FUD ;-) >> ... >> My hope was that 495 alone would at least spare pytz's users from >> needing to do a `.normalize()` dance after `.astimezone()` anymore. >> Although I'm not clear on why it's needed even now. > Instead of one tzinfo instance, there are dozens for your timezone. > The datetime implementation does not give pytz the opportunity to > choose which one is used when constructing the datetime, so localize > is needed to sort that. Similarly, arithmetic does not always give > pytz the opportunity to choose which one is used after crossing a > timezone boundary, so normalize is needed to sort that out. While the > results of the timeline arithmetic are unambiguous and obvious, they > are arguably incorrect until normalize puts things right. This is .astimezone(), though - no constructor and no (visible) arithmetic here. 
It's returning something via fromutc(), and I presume pytz has its own .fromutc() implementation. > ... > I think I'm after hooks to replace localize on construction and > normalize after arithmetic, so users don't have to be relied on to do > this explicitly. This doesn't need to happen now, and I fully > understand this could be considered fast path and the overhead > unacceptable. If you're determined to supply by-magic timeline arithmetic, then I strongly suggest looking at the ideas at the top of this message, and push for a _real_ change to Python. That is, instead of pushing for hooks wholly specific to pytz, push for a change that will allow anyone to implement timeline arithmetic in a straightforward way, using non-magical "hybrid" tzinfo classes. But that's not my itch, and - indeed - I'd prefer Python left well enough alone after 495 allows repairing the fundamental problem with conversions. > ... > I think all the data we have access to, including from platform C > library functions, uses the is_dst flag or is simpler to map to the > is_dst flag. I need a complete use case, start to finish, to make sense of what you're talking about here. In particular, you never mention any datetime or pytz operations when talking about is_dst. So I still have no idea why it's being discussed at all. > The C library as exposed by the time.struct_time gives you is_dst. See other msgs today. mktime() is unreliable. Even if it was reliable, what of it? Why do you _want_ is_dst? There's no use case here that consumes it. > Mapping that to first/fold means first doing doing two conversions and > determining which one comes first. Ditto. I have no idea what use case you have in mind that would _require_ mapping is_dst to fold. Inside pytz, you have an exhaustive list of all transitions, thanks to zoneinfo. pytz internals don't need any flaky C library functions to determine anything about transitions. 
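To illustrate why a complete transition list is enough, here is a small sketch that classifies a wall-clock reading as normal, ambiguous (fold), or missing (gap). The list layout (`tt` holding UTC transition instants, `ofs` holding the total UTC offset in force from each instant on) and the function name are assumptions made for illustration, not pytz's internal structures, and a linear scan stands in for a faster lookup. The sample data are three US/Eastern transitions.

```python
from datetime import datetime, timedelta

def classify_wall_time(wall, tt, ofs):
    """Return "fold", "gap" or "normal" for a naive wall-clock datetime.

    tt  -- sorted transition instants, as naive datetimes read as UTC
    ofs -- ofs[i] is the total UTC offset in force from tt[i] onward
    (Both layouts are assumptions made for this sketch.)
    """
    for i in range(1, len(tt)):
        before, after = ofs[i - 1], ofs[i]
        # Wall-clock readings just before and just after the transition.
        lo, hi = sorted((tt[i] + before, tt[i] + after))
        if lo <= wall < hi:
            # Offset went down: the clock was set back, readings repeat.
            # Offset went up: the clock jumped ahead, readings are skipped.
            return "fold" if after < before else "gap"
    return "normal"

# Three US/Eastern transitions (EST = UTC-5, EDT = UTC-4).
tt = [datetime(2003, 10, 26, 6), datetime(2004, 4, 4, 7), datetime(2004, 10, 31, 6)]
ofs = [timedelta(hours=-5), timedelta(hours=-4), timedelta(hours=-5)]

print(classify_wall_time(datetime(2004, 10, 31, 1, 15), tt, ofs))  # fold
print(classify_wall_time(datetime(2004, 4, 4, 2, 30), tt, ofs))    # gap
print(classify_wall_time(datetime(2004, 7, 1, 12, 0), tt, ofs))    # normal
```

No call to mktime() or any other C library routine is involved; the answer comes from comparing the reading against the offsets on either side of each transition.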
> Similarly, when loading your JSON file or examining email headers you > need to load in a string like '2004-04-04 02:30:00 EDT-05:00'. Its > simple to use a lookup table to map the abbreviation + offset to an > is_dst flag. As above. > Its harder to map it to first/fold because they are > swapped around in April and October. And there can be more than two > transitions in a year, so if you need to support that your going to > need to do the lookup, construct a couple of instances, and compare to > work out if EDT or EST comes first that month in that year. Inside pytz you already know everything that can be known about transitions. You don't "poke and hope" to do that, you do a binary search, right? You find the zoneinfo record for the time of interest, and compare that to the transitions on either side to deduce whether there's a fold or gap in play. Although I bet this could be sped up by doing some precomputation when loading a tzfile to begin with. > But, really, I hate all the options for the flag name. I lean towards > is_dst mainly because people are used to it. I'm burned out on name bikeshedding - but `is_dst` makes no sense unless the flag is at least pretending to say something about whether DST is in effect. That's not enough. For example, the zoneinfo source notes that there's a place in Antarctica that has two different kinds of DST each year. It's so bizarre that zic (the zoneinfo compiler) had to be changed to handle it, and they've left the rules commented out until the new zic is more widely adopted. When they uncomment the rules, is_dst will tell you nothing about _which_ kind of DST is in effect (the offset+1 flavor, or the offset+2 flavor).. "fold" makes perfectly clear sense for transitions due to any cause whatsoever. 
The only advantage to is_dst is that it's so poorly defined for edge cases that no two mktime() implementations can be expected to agree :-(

>> But there's every reason to be optimistic: even someone as old and
>> in-the-way as me doesn't find any of this particularly confusing ;-)

> I may be old, but at least I'm not as old as Tim ;)

Ain't that the truth :-(

From tim.peters at gmail.com Fri Aug 28 18:12:10 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Fri, 28 Aug 2015 11:12:10 -0500
Subject: [Datetime-SIG] PEP-431/495
In-Reply-To:
References:
Message-ID:

[Tim]
> ...
> Inside pytz you already know everything that can be known about
> transitions. You don't "poke and hope" to do that, you do a binary
> search, right? You find the zoneinfo record for the time of interest,
> and compare that to the transitions on either side to deduce whether
> there's a fold or gap in play.

I should try to flesh that out. I don't work with tzfiles (I'm on Windows - nobody gives a shit about history on Windows ;-) ). My understanding is that you have a sorted list of transition times, in UTC. I'll call that list `tt` (transition times). Also a parallel list of (at least) total UTC offsets. I'll call that list `ofs`.

`fold` should be 1 if and only if the UTC time `u` is the later of times ambiguous in the local zone. Ambiguous times exist if and only if a transition decreases the total UTC offset. There are some before the transition ("at the end of its life"), and some after the transition ("at the start of its life"). fold is 1 only in the latter case (the later of ambiguous times).

So first do a binary search on `tt` to find the largest transition time <= u. Call the index `i`. You must already have code to do that, yes? Then:

    if i == 0:
        # First entry in the file - there is no
        # transition _to_ this.
        return 0
    # How much did the total UTC offset change?
    delta = ofs[i] - ofs[i-1]
    if delta >= 0:
        # The offset didn't change, or the
        # offset increased so this is a gap (not
        # fold) case.
        return 0
    # delta < 0, so the offset decreased. All
    # and only the times from tt[i] up to (but
    # not including) tt[i] - delta ("-" because
    # delta is negative) are the later of
    # ambiguous times.
    return int(u - tt[i] < -delta)
    # or int(tt[i] - u > delta) to save a "-"
    # at the cost of making it incomprehensible

All of that may be wrong ;-) The _point_ is that the base ideas are so straightforward that if the code works in practice for some non-trivial case, it probably works for almost all cases.

I don't know enough about tzfiles to be sure of anything. In particular, a `delta` of 0 _may_ mean there's some kind of more-or-less artificial (having nothing to do with offsets) entry in the file, and it's necessary to search back for the closest preceding offset that's _not_ the same as ofs[i]. For example, maybe just the name of the zone changed? If so, then runtime search could be avoided by pre-computing such stuff when loading the tzfile to begin with. For that matter, a parallel `delta` list could be precomputed once-&-for-all.

This _could_ be made to run very fast. Getting the right answer is minor compared to running fast ;-)

Then compare those 6 lines of code to the incredible gyrations any implementation of mktime() goes through to get is_dst right (if you can find a mktime() implementation that always does get it right).

From ethan at stoneleaf.us Fri Aug 28 19:55:29 2015
From: ethan at stoneleaf.us (Ethan Furman)
Date: Fri, 28 Aug 2015 10:55:29 -0700
Subject: [Datetime-SIG] PEP-431/495
In-Reply-To:
References: <55DF4239.2090404@oddbird.net>
Message-ID: <55E0A091.3010009@stoneleaf.us>

On 08/27/2015 10:10 AM, Alexander Belopolsky wrote:
> On Thu, Aug 27, 2015 at 1:00 PM, Carl Meyer wrote:
>>
>> I spent some time in rural southern Mexico, where people referred to DST
>> as "government time" and non-DST as "God's time."
>
> Interesting.
> In Russia it was sometimes called the "decree" time even though
> historically the "decree time" was something different.  (And guess what: the
> spring-forward/fall-back mnemonics don't help non-English speakers.)

It only barely helps me (a native English speaker living in Pacific
Time) -- it tells me which way the clock is going to go, but I am still
clueless as to which is Daylight Savings and which is Daylight Standard.

--
~Ethan~

From alexander.belopolsky at gmail.com  Fri Aug 28 20:01:23 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Fri, 28 Aug 2015 14:01:23 -0400
Subject: [Datetime-SIG] PEP-431/495
In-Reply-To: <55E0A091.3010009@stoneleaf.us>
References: <55DF4239.2090404@oddbird.net> <55E0A091.3010009@stoneleaf.us>
Message-ID:

On Fri, Aug 28, 2015 at 1:55 PM, Ethan Furman wrote:

> I am still clueless as to which is Daylight Savings and which is Daylight
> Standard.

.. or how to spell Daylight-Saving Time. :-)  (Sorry - could not resist -
I am a far worse speller than you are!)

From ethan at stoneleaf.us  Fri Aug 28 20:16:52 2015
From: ethan at stoneleaf.us (Ethan Furman)
Date: Fri, 28 Aug 2015 11:16:52 -0700
Subject: [Datetime-SIG] PEP-431/495
In-Reply-To:
References: <55DF4239.2090404@oddbird.net> <55E0A091.3010009@stoneleaf.us>
Message-ID: <55E0A594.50401@stoneleaf.us>

On 08/28/2015 11:01 AM, Alexander Belopolsky wrote:
> On Fri, Aug 28, 2015 at 1:55 PM, Ethan Furman wrote:
>>
>> I am still clueless as to which is Daylight Savings and which is Daylight Standard.
>
> .. or how to spell Daylight-Saving Time. :-)  (Sorry - could not resist - I am a far
> worse speller than you are!)

Heh.
:) -- ~Ethan~ From guido at python.org Sat Aug 29 00:53:40 2015 From: guido at python.org (Guido van Rossum) Date: Fri, 28 Aug 2015 15:53:40 -0700 Subject: [Datetime-SIG] A local timezone class Message-ID: I've written a simple class that implements a local timezone tzinfo object, deferring to what the time module exposes about the local zone. https://gist.github.com/gvanrossum/ef201fe313719305c4c7 There are two variations: one for systems that support tm_gmtoff and tm_zone, one for systems without those. Output of the test program: BetterLocalTimeZone Fri Aug 28 15:50:22 2015 PDT (-0700) Wed Feb 24 15:50:22 2016 PST (-0800) I'm not entirely sure why we didn't add this to the stdlib ages ago. (Maybe out of a sense of perfectionism, since time.localtime() may be wrong for dates in the past or future where different DST rules or a different standard offset apply? But why would we care, if we're fine with the time module's behavior?) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sat Aug 29 01:29:58 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 28 Aug 2015 19:29:58 -0400 Subject: [Datetime-SIG] PEP 495: Conversion from naive to aware Message-ID: While PEP 495 is primarily about improving the tzinfo interface, it proposes one feature that will be visible to datetime users who don't write their own tzinfo implementations or use any from a third party. This feature is described in the section "Conversion from Naive to Aware" [1] and will enable users to call astimezone() method on naive instances to convert them to aware. The PEP proposes that naive datetimes should be presumed by astimezone() to be in system local timezone similarly to the way they are treated in the timestamp() method now. 
This feature has been criticized because "Treating a naive datetime as
being a local time is as senseless as treating it as a UTC time - it's
arbitrary." [2]

The proposed changes to astimezone() will make it discriminate between
the two arbitrary choices.  Converting a naive datetime to, say, the UTC
timezone assuming it is local will become

>>> dt.astimezone(timezone.utc)

but the same conversion assuming dt is already in UTC is done by

>>> dt.replace(tzinfo=timezone.utc)

The asymmetry is more visible when the task is to imbue a naive
datetime with an appropriate local fixed offset timezone:

>>> dt.astimezone()  # assuming dt is local

vs.

>>> dt.replace(tzinfo=timezone.utc).astimezone()  # assuming dt is UTC

As an alternative to an implicit local timezone, we can provide an
explicit timezone.local implementation and tell users to write

>>> dt.replace(tzinfo=timezone.local).astimezone()  # assuming dt is local

making working with implicitly local and implicitly UTC naive instances
equally inconvenient.

Furthermore, implicit local time zone logic is already implemented with
respect to the target timezone: to convert to a local (fixed offset)
timezone astimezone() is called without arguments, but any other timezone
needs to be provided explicitly, including timezone.utc.

IMO, if nondiscrimination between UTC and local is important, I would
rather add utcastimezone() and utctimestamp() methods that will work
like their namesakes but assume that the instances they are invoked on
are in UTC.

Please discuss.

[1]: https://www.python.org/dev/peps/pep-0495/#conversion-from-naive-to-aware
[2]: https://mail.python.org/pipermail/datetime-sig/2015-August/000464.html
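The two spellings contrasted in this message can be tried directly. This is a minimal sketch, assuming Python 3.6 or later (where calling astimezone() on a naive instance is accepted); the sample value and the variable names are illustrative only.

```python
from datetime import datetime, timezone

# A naive datetime: no tzinfo attached (sample value chosen for illustration).
dt = datetime(2015, 8, 28, 19, 29, 58)

# Presume dt is system local time and convert it to UTC.
presumed_local = dt.astimezone(timezone.utc)

# Presume dt is already UTC: only the label changes, not the clock fields.
presumed_utc = dt.replace(tzinfo=timezone.utc)

# astimezone() on a naive value should agree with timestamp(), which
# already treats naive values as local time.
agrees = presumed_local == datetime.fromtimestamp(dt.timestamp(), timezone.utc)

# No argument: attach an appropriate fixed-offset local zone.
local_aware = dt.astimezone()

print(presumed_utc.isoformat())   # wall clock unchanged
print(agrees)
print(local_aware.tzinfo)         # depends on the machine's local zone
```

What presumed_local prints depends on the machine's zone, which is exactly the asymmetry under discussion: only the replace() spelling gives the same answer everywhere.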
From alexander.belopolsky at gmail.com  Sat Aug 29 01:37:24 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Fri, 28 Aug 2015 19:37:24 -0400
Subject: [Datetime-SIG] A local timezone class
In-Reply-To:
References:
Message-ID:

On Fri, Aug 28, 2015 at 6:53 PM, Guido van Rossum wrote:

> I'm not entirely sure why we didn't add this to the stdlib ages ago.

Maybe because tm_gmtoff was not added to time.struct_time until Python
3.3 [1] and was not available from most of the C libraries at the time?

[1]: https://docs.python.org/3/library/time.html#time.struct_time

From guido at python.org  Sat Aug 29 01:51:38 2015
From: guido at python.org (Guido van Rossum)
Date: Fri, 28 Aug 2015 16:51:38 -0700
Subject: [Datetime-SIG] A local timezone class
In-Reply-To:
References:
Message-ID:

On Fri, Aug 28, 2015 at 4:37 PM, Alexander Belopolsky <
alexander.belopolsky at gmail.com> wrote:

>
> On Fri, Aug 28, 2015 at 6:53 PM, Guido van Rossum
> wrote:
>
>> I'm not entirely sure why we didn't add this to the stdlib ages ago.
>
>
> Maybe because tm_gmtoff was not added to time.struct_time until Python 3.3
> [1] and was not available from most of the C libraries at the time?
>
> [1]: https://docs.python.org/3/library/time.html#time.struct_time
>

But (as my base class shows) implementing the required API using only the
tm_isdst flag and the timezone-related attributes of the time module
(tzname, timezone, altzone) a satisfactory implementation can be obtained,
and that API has been stable since the beginning of time.  (Well, at least
since 1993: https://hg.python.org/cpython-fullhistory/rev/6ee380349c84 .)

--
--Guido van Rossum (python.org/~guido)
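For readers without the gist at hand, here is a minimal sketch of a local-zone tzinfo that leans only on tm_isdst and the time module's timezone, altzone and tzname attributes. It is not the code from the gist: the class name LocalTimezone and the helper _isdst are hypothetical, and the class knows only the current rules, so it inherits whatever the platform's mktime() decides inside a fold or gap.

```python
import time
from datetime import datetime, timedelta, tzinfo


class LocalTimezone(tzinfo):
    """Hypothetical local-zone tzinfo built only on the time module."""

    def _isdst(self, dt):
        # Pass tm_isdst=-1 so mktime() decides, then read back its choice.
        tt = (dt.year, dt.month, dt.day, dt.hour, dt.minute, dt.second,
              dt.weekday(), 0, -1)
        return time.localtime(time.mktime(tt)).tm_isdst > 0

    def utcoffset(self, dt):
        # time.timezone and time.altzone count seconds WEST of UTC,
        # so the sign is flipped to get the usual east-positive offset.
        west = time.altzone if self._isdst(dt) else time.timezone
        return timedelta(seconds=-west)

    def dst(self, dt):
        if self._isdst(dt):
            return timedelta(seconds=time.timezone - time.altzone)
        return timedelta(0)

    def tzname(self, dt):
        return time.tzname[1 if self._isdst(dt) else 0]


stamp = datetime(2015, 8, 28, 15, 50, 22, tzinfo=LocalTimezone())
print(stamp.strftime("%c %Z (%z)"))  # output depends on the local zone
```

Because the offsets come from two module-level constants, dates governed by older or future rules get today's offsets, which is the imperfection the thread goes on to weigh.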
From alexander.belopolsky at gmail.com  Sat Aug 29 01:58:21 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Fri, 28 Aug 2015 19:58:21 -0400
Subject: [Datetime-SIG] A local timezone class
In-Reply-To:
References:
Message-ID:

On Fri, Aug 28, 2015 at 7:51 PM, Guido van Rossum wrote:

> On Fri, Aug 28, 2015 at 6:53 PM, Guido van Rossum
>> wrote:
>>
>>> I'm not entirely sure why we didn't add this to the stdlib ages ago.
>>
>>
>>
> But (as my base class shows) implementing the required API using only the
> tm_isdst flag and the timezone-related attributes of the time module
> (tzname, timezone, altzone) a satisfactory implementation can be obtained,
> and that API has been stable since the beginning of time.
>

To be fair, we did ship something very similar, but it was only accessible
to users who read the library manual: <
https://hg.python.org/cpython/file/v3.5.0rc2/Doc/includes/tzinfo-examples.py#l54>.
:-)

From guido at python.org  Sat Aug 29 02:06:10 2015
From: guido at python.org (Guido van Rossum)
Date: Fri, 28 Aug 2015 17:06:10 -0700
Subject: [Datetime-SIG] A local timezone class
In-Reply-To:
References:
Message-ID:

On Fri, Aug 28, 2015 at 4:58 PM, Alexander Belopolsky <
alexander.belopolsky at gmail.com> wrote:

>
> On Fri, Aug 28, 2015 at 7:51 PM, Guido van Rossum
> wrote:
>
>> On Fri, Aug 28, 2015 at 6:53 PM, Guido van Rossum
>>> wrote:
>>>
>>>> I'm not entirely sure why we didn't add this to the stdlib ages ago.
>>>
>>>
>> But (as my base class shows) implementing the required API using only the
>> tm_isdst flag and the timezone-related attributes of the time module
>> (tzname, timezone, altzone) a satisfactory implementation can be obtained,
>> and that API has been stable since the beginning of time.
>> > > To be fair, we did ship something very similar, but it was only accessible > to users who read the library manual: < > https://hg.python.org/cpython/file/v3.5.0rc2/Doc/includes/tzinfo-examples.py#l54>. > :-) > So, again, why have we been shy of adding this to the stdlib? We did (eventually) add the fixed offset classes. (And before you say "for the same reason we didn't add the USTimeZone class", the reasons can't be the same -- the latter would require maintenance whenever the US changes its DST rules, while LocalTime is oblivious to all that.) -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sat Aug 29 02:26:56 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 28 Aug 2015 20:26:56 -0400 Subject: [Datetime-SIG] A local timezone class In-Reply-To: References: Message-ID: On Fri, Aug 28, 2015 at 8:06 PM, Guido van Rossum wrote: > .. > So, again, why have we been shy of adding this to the stdlib? We did > (eventually) add the fixed offset classes. > I may be wrong, but I thought it was due to unresolvable DST fold issues that are mooted by PEP 495. (At least that was the reason I did not push for LocalTimezone as hard as I did for the fixed offset class.) I am definitely +1 on having LocalTimezone (or timezone.local) in the datetime module, but there is one reason why I would rather see it as a part of a larger Olson TZ database-size offering. If we just add LocalTimezone, pickling a datetime instance on one system and reading it on another with a different TZ will result in changing the time value. On the other hand, if we have a complete TZ database, then LocalTimezone can simply become an alias for Zoneinfo(os.environ['TZ']) and there will be no problem sharing aware datetime objects between systems that use the same version of the database. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From alexander.belopolsky at gmail.com Sat Aug 29 02:45:59 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Fri, 28 Aug 2015 20:45:59 -0400 Subject: [Datetime-SIG] A local timezone class In-Reply-To: References: Message-ID: On Fri, Aug 28, 2015 at 8:26 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > On Fri, Aug 28, 2015 at 8:06 PM, Guido van Rossum > wrote: > >> .. >> So, again, why have we been shy of adding this to the stdlib? We did >> (eventually) add the fixed offset classes. >> > > I may be wrong, but I thought it was due to unresolvable DST fold issues > that are mooted by PEP 495. (At least that was the reason I did not push > for LocalTimezone as hard as I did for the fixed offset class.) > See . -------------- next part -------------- An HTML attachment was scrubbed... URL: From guido at python.org Sat Aug 29 02:47:48 2015 From: guido at python.org (Guido van Rossum) Date: Fri, 28 Aug 2015 17:47:48 -0700 Subject: [Datetime-SIG] A local timezone class In-Reply-To: References: Message-ID: On Fri, Aug 28, 2015 at 5:26 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > > On Fri, Aug 28, 2015 at 8:06 PM, Guido van Rossum > wrote: > >> .. >> So, again, why have we been shy of adding this to the stdlib? We did >> (eventually) add the fixed offset classes. >> > > I may be wrong, but I thought it was due to unresolvable DST fold issues > that are mooted by PEP 495. (At least that was the reason I did not push > for LocalTimezone as hard as I did for the fixed offset class.) > Again, that sounds like perfection being the enemy of the good. :-( Those issues are resolved by the demo class however mktime() resolves them when tm_isdst is set to -1. Outside the fold it should work fine, and in the fold it should work as well as can be expected. 
:-) I am definitely +1 on having LocalTimezone (or timezone.local) in the > datetime module, but there is one reason why I would rather see it as a > part of a larger Olson TZ database-size offering. > > If we just add LocalTimezone, pickling a datetime instance on one system > and reading it on another with a different TZ will result in changing the > time value. > OK, that's definitely a concern. > On the other hand, if we have a complete TZ database, then LocalTimezone > can simply become an alias for Zoneinfo(os.environ['TZ']) and there will be > no problem sharing aware datetime objects between systems that use the same > version of the database. > (But can a TZ database ever be considered complete? :-) I think the best way out of this will be for there to be two different local timezone objects: one that always defers to what the time module reveals about local time, another that is always tied to the local zone as it is recorded in the tz database. The two may differ even if they agree about the tz name; the tz database may have historical details that are missing from the implementation consulted by mktime() and localtime() (e.g. on Windows, or using certain encodings of the TZ environment variable on certain systems). -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Sat Aug 29 20:50:33 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sat, 29 Aug 2015 13:50:33 -0500 Subject: [Datetime-SIG] Trivial vs easy: .utcoffset() Message-ID: Timezone conversion is mathematically trivial, but that doesn't mean it's obvious or easy. Details can really bite. tzinfo supplies .utcoffset(), which made converting _to_ UTC dead obvious. But how to convert _from_ UTC remained clear as mud. 
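The "dead obvious" direction can be sketched in a few lines. This is only an illustration: ToyEastern is a hypothetical zone with a single hard-coded transition (it ignores the fold entirely), and to_utc is a helper name invented here.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class ToyEastern(tzinfo):
    """Hypothetical zone: UTC-4 before 2015-11-01 02:00 wall time, UTC-5 after."""

    SWITCH = datetime(2015, 11, 1, 2, 0)  # naive local wall-clock time

    def _is_summer(self, dt):
        return dt.replace(tzinfo=None) < self.SWITCH

    def utcoffset(self, dt):
        return timedelta(hours=-4 if self._is_summer(dt) else -5)

    def dst(self, dt):
        return timedelta(hours=1 if self._is_summer(dt) else 0)

    def tzname(self, dt):
        return "EDT" if self._is_summer(dt) else "EST"


def to_utc(t):
    """Local -> UTC: subtract whatever offset the tzinfo reports for t."""
    return (t.replace(tzinfo=None) - t.utcoffset()).replace(tzinfo=timezone.utc)


summer = datetime(2015, 8, 29, 13, 50, 33, tzinfo=ToyEastern())
winter = datetime(2015, 12, 1, 12, 0, tzinfo=ToyEastern())
print(to_utc(summer))  # 2015-08-29 17:50:33+00:00
print(to_utc(winter))  # 2015-12-01 17:00:00+00:00
```

Going the other way is the awkward part: utcoffset() is keyed on local wall-clock time, which is the very thing a UTC-to-local conversion is trying to find.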
The default .fromutc() "gets it right" (as far as is possible without a fold/is_dst flag), but _only_ handles DST transitions that strictly alternate between "on" (although with a DST adjustment that may change each time) and "off" (a DST adjustment of exactly 0). Nothing fancier than that; e.g., no base offset changes. It was amazingly annoying to craft an efficient, correct (so far as it goes) implementation of just that much. Even then, hand-written tzinfo implementations had to express DST transition points in "standard time" for it to always work, instead of in natural local wall-clock times (end-of-DST is where that makes a difference). As Alex noted elsewhere, unlike the hand-written .utcoffset() implementations shown in the Python docs, most timezone sources (chiefly Olson - zoneinfo) effectively supply a .fromutc() implementation instead. Which makes converting from UTC dead obvious, but - surprise ;-) - leaves how to convert _to_ UTC (how to implement .utcoffset()) clear as mud instead. In a zoneinfo world, referring back to Guido's diagram a local datetime is staring at a chart with _no_ visible diagonal lines when looking right from the Y (local) axis; they're only visible when looking up from the X (UTC) axis. The hand-written tzinfo classes in the Python docs had the opposite problem, but implicitly left it to the default .fromutc() to figure out the invisible part so "the problem" isn't apparent in the docs. Stewart noted before that always using fixed-offset classes in pytz effectively supplies the missing is_dst bit, but it does more than just that: it effectively stores the datetime's current UTC offset too. The transition charts a pytz tzinfo sees always have a single, continuous diagonal line, visible from both axes. Easy peasy. 
In return, any operation on the datetime object that creates a new datetime but just copies the original tzinfo into the result may end up with a tzinfo that's no longer correct (lying about the UTC offset that's _appropriate_ for the new date and time). Hence the need to call .normalize() all over the place. If .normalize() were applied magically instead by Python internals, that need would go away, but then timeline arithmetic is "the natural" result - it's unclear to me that classic arithmetic _could_ be implemented if the result of every relevant operation added a "convert to UTC and back again, to get the appropriate current UTC offset" step at the end (not to mention how much slower much code would become). So how can .utcoffset() be computed efficiently in a zoneinfo world using "hybrid" tzinfo classes (tzinfos that are smart enough to figure out the appropriate offset all on their own)? It's like re-inventing the default .fromutc() all over again, but in the other direction in a much lumpier world. Of course there are many ideas. Rather than drone on about them, I'd like to put the puzzle out there in case a correct "duh - it's obvious, you moron" reply is just waiting for an invitation - but do note the "correct" ;-) BTW, after 15 minutes I wasn't able to convince myself I understood what dateutil's zoneinfo-wrapping's .utcoffset() was doing; and if I don't understand what it's doing, there's no way I can guess whether it's always correct. One obvious idea for a zoneinfo exhaustive-list-of-transitions-in-UTC world: precompute another exhaustive list of transitions, but expressed in local time (including "fold") mapping to the correct UTC offset at each point. That could pretty obviously work, but is essentially a way of implementing "poke and hope" in a simple, uniform way (via binary search). There's also that exhaustive lists of transition points is a doomed approach over time. 
zoneinfo supplies them through 2037 for the benefit of legacy clients, but they expect modern clients to use a POSIX TZ rule (stored in version 2 tzfiles) too. pytz and dateutil both ship with version 2 (or maybe version 3) tzfiles, but neither goes beyond using the version 1 exhaustive-list portion of tzfiles. So more fun is waiting there ;-) From guido at python.org Sat Aug 29 20:57:37 2015 From: guido at python.org (Guido van Rossum) Date: Sat, 29 Aug 2015 11:57:37 -0700 Subject: [Datetime-SIG] PEP 495: Conversion from naive to aware In-Reply-To: References: Message-ID: OK, I've pondered this some more, and in the end I think it is a reasonable proposal. Note that there are many subtleties in various other methods already; e.g. the distinction between timetuple() and utctimetuple(), or between fromtimestamp() and utcfromtimestamp(), is not as straightforward as their names might suggest. So I think it's better not to add more utc*() functions. If I had to do the design over I might have made a deeper distinction between aware and naive datetimes (e.g. aware being a subclass of naive) but such a redesign is not on the table. Another (mostly unrelated) observation about datetime: even an aware datetime does not really represent an instant in time, unless its timezone has a fixed offset. For example, suppose I create a datetime representing noon, June 3rd, US/Eastern time in 2020. My best guess is that this will correspond to the instant 16:00 in UTC on that date. But I can't really tell. Maybe by then the campaign against DST has finally succeeded, and noon on that date in US/Eastern is actually 17:00 UTC. But if I store this date in a database and read it back in the year 2020, I want it to be read back as noon US/Eastern (June 3rd 2020), not as "whatever time in US/Eastern corresponding to 16:00 UTC on that date". Now you may wonder why I prefer it this way, and I'll just let you guess -- but I need this for my use case. 
If your use case needs to record the instant, you should use UTC or some other fixed offset zone. (And yes, I fundamentally consider pytz's behavior here a bug, and datetime's behavior a feature.) On Fri, Aug 28, 2015 at 4:29 PM, Alexander Belopolsky < alexander.belopolsky at gmail.com> wrote: > While PEP 495 is primarily about improving the tzinfo interface, it > proposes one feature that will be visible to datetime users who don't write > their own tzinfo implementations or use any from a third party. This > feature is described in the section "Conversion from Naive to Aware" [1] > and will enable users to call astimezone() method on naive instances to > convert them to aware. > > The PEP proposes that naive datetimes should be presumed > by astimezone() to be in system local timezone similarly to the way they > are treated in the timestamp() method now. > > This feature has been criticized because "Treating a naive datetime as > being a local time is as senseless as treating it as a UTC time - it's > arbitrary." [2] > > The proposed changes to astimezone() will make it discriminate between the > two arbitrary choices. Converting a naive datetime to say UTC timezone > assuming it is local will become > > >>> dt.astimezone(timezone.utc) > > but the same conversion assuming dt is already in UTC is done by > > >>> dt.replace(tzinfo=timezone.utc) > > The asymmetry is more visible in when the task is to imbue a naive > datetime with an appropriate local fixed offset timezone: > > >>> dt.astimezone() # assuming dt is local > > vs. > > >>> dt.replace(tzinfo=timezone.utc).astimezone() # assuming dt is UTC > > As an alternative to an implicit local timezone, we can provide an > explicit timezone.local implementation and tell users to write > > >>> dt.replace(tzinfo=timezone.local).astimezone() # assuming dt is local > > making working with implicitly local and implicitly UTC naive instances > equally inconvenient. 
> > Furthermore, implicit local time zone logic is already implemented with > respect to the target timezone: to convert to a local (fixed offset) > timezone astimezone() is called without arguments, but any other timezone > needs to be provided explicitly, including timezone.utc. > > IMO, if nondiscrimination between between UTC and local is important, I > would rather add utcastimezone() and utctimestamp() methods that will work > like their namesakes but assume that the instances they are invoked on are > in UTC. > > Please discuss. > > > > > [1]: > https://www.python.org/dev/peps/pep-0495/#conversion-from-naive-to-aware > [2]: > https://mail.python.org/pipermail/datetime-sig/2015-August/000464.html > > _______________________________________________ > Datetime-SIG mailing list > Datetime-SIG at python.org > https://mail.python.org/mailman/listinfo/datetime-sig > The PSF Code of Conduct applies to this mailing list: > https://www.python.org/psf/codeofconduct/ > -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From alexander.belopolsky at gmail.com Sat Aug 29 22:18:15 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Sat, 29 Aug 2015 16:18:15 -0400 Subject: [Datetime-SIG] Trivial vs easy: .utcoffset() In-Reply-To: References: Message-ID: > On Aug 29, 2015, at 2:50 PM, Tim Peters wrote: > > So how can .utcoffset() be computed efficiently in a zoneinfo world > using "hybrid" tzinfo classes (tzinfos that are smart enough to figure > out the appropriate offset all on their own)? As I am learning more about Olson/IANA database, I am more and more convinced that Python approach is better than that of UNIX. The Python approach is to provide effectively local to UTC mapping via utcoffset() while UNIX approach is to provide UTC to local mapping via the localtime() function. 
Python then supplies fromutc() which is really simple in regular cases
and I think implementable in a general case while UNIX supplies its
mktime which is a "poke six times and hope it is enough" mess.

The reason I think Python API is superior is because with the exception
of leap seconds, all transitions in Olson database are given in local
time in the raw files.  The raw files then get "compiled" so that
localtime() can be implemented efficiently and Olson never supplies his
own mktime as far as I can tell.

A familiar example where DST rules are simpler when formulated in local
time are the US rules.  In local time, all three (or four?) US zones
have exactly the same rule - fall-back at 2am on the first Sunday in
November and spring-forward at 2am on the second Sunday in March.
Expressed in UTC, the transitions will be all at a different hour and
may not even happen on the same day.

I think the future of TZ support in Python is to come up with some
automatic way to translate from raw Olson files to
utcoffset()/dst()/tzname() implementations and invent some clever
fromutc() algorithm that will correctly "invert" y = x - x.utcoffset()
in all cases.

For the latter task, I have a solution in my prototype branch, but it
requires up to six calls to utcoffset() which may indeed be the best one
can do and it is not a coincidence that the number of calls in the worst
case is the same as the number of pokes in mktime.

From tim.peters at gmail.com  Sun Aug 30 03:49:58 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Sat, 29 Aug 2015 20:49:58 -0500
Subject: [Datetime-SIG] Trivial vs easy: .utcoffset()
In-Reply-To:
References:
Message-ID:

[Alexander Belopolsky ]
> As I am learning more about Olson/IANA database, I am
> more and more convinced that Python approach is better
> than that of UNIX.

While I'm leaning more & more to the opposite conclusion ;-)  That is,
you can't fight crushing success.  Like IEEE-754 was for binary
floating point, zoneinfo is a "category killer".
It seems very likely that no competing approach will ever attract
enough interest to get anywhere.  The number of people who truly care
enough to even try can be counted on two middle fingers.

Since the zoneinfo data so strongly favors UTC->local conversions, the
only sane way to play along with it is to view .fromutc() as the
primary tzinfo method and .utcoffset() as a possibly horridly slow
afterthought.

And then there's dateutil's wrapping.  Amazingly enough, it inherits
the default .fromutc(), even though zoneinfo data makes that direction
hard _not_ to get right in all cases.

> The Python approach is to provide effectively local to UTC mapping
> via utcoffset() while UNIX approach is to provide UTC to local mapping
> via the localtime() function.

I expect that's mostly because the UNIX tradition strongly favors
setting the system clock to use UTC.  UTC->local conversions may be
needed countless times each day just in ordinary use by people who
couldn't care less about timezones (except that they want to see their
own local time).

Windows solves that problem by running the system clock _in_ local
time.  What could possibly go wrong? ;-)

> Python then supplies fromutc() which is real simple in regular cases
> and I think implementable in a general case

Except "it sucks" when a system-supplied function doesn't handle all
cases.  I spent most of my career working for computer design
companies.  More than once, the HW guys and the bosses would come with
questions like "oops! we missed a gate in the ALU, and sometimes
addition may not propagate a carry from bit 12 into bit 13 - will that
be a problem for you guys?".  When HW product releases and millions of
dollars are on the line, it's real tempting to say "hey, no problem -
ship it!  if they really care, they can cross-check their additions
with an abacus" ;-)

> while UNIX supplies its mktime which is a poke six times and hope
> it is enough mess.
Which is my original puzzle: _given_ that the zoneinfo world
apparently doesn't care much about local->UTC conversions, is mktime
the best that can be done?

> The reason I think Python API is superior is because with exception
> of leap seconds, all transitions in Olson database are given in local
> time in the raw files.

Well, they don't have to be in the plain text data files, that's just
the default.  And overwhelmingly most common.  There are also ways to
say "but this time is in the zone's 'standard time' regardless of
is_dst", and "but this time is in UTC".  I believe the _intent_ of all
this is to specify the rules using whatever scheme the political
authority announcing the rules used (to reduce errors, and to ease
independent verification against source materials).

> The raw files then get "compiled" so that localtime() can be implemented
> efficiently

Yes, the _explicit_ transition lists are all converted to POSIX
timestamps (UTC seconds-from-the-epoch).  But all the POSIX TZ rules
I've seen generated in versions > 1 use the POSIX "local wall clock
time" convention.

> and Olson never supplies his own mktime as far as I can tell.

This old implementation has his name on it:

    http://www.opensource.apple.com/source/ntp/ntp-13/ntp/libntp/mktime.c

I kinda like it.  It doesn't try hard to be clever.  At heart, it does
a binary search over all possible time_t values, calling localtime()
on each until it finally manages to reproduce the input.  But a
comment notes that it failed at first in some cases because it didn't
take is_dst into account.  That was repaired by assuming DST
transitions are all exactly one hour, and:

/*
 * So, if you live somewhere in the world where dst is not 60 minutes offset,
 * and your vendor doesn't supply mktime(), you'll have to edit this variable
 * by hand.  Sorry about that.
 */

Alas, I'm still capable of being embarrassed ;-)

> A familiar example where DST rules are simpler when formulated
> in local time are the US rules.
In local time, all three (or four?) US > zones There are four major US zones. But people keep forgetting US places like Hawaii and Alaska and far east Maine (Atlantic Standard Time). I believe there are 9(!) "US" zones now. > have exactly the same rule - fall-back at 2am on first Sunday in November > and spring-forward at 2am on the second Sunday in March. Expressed > in UTC, the transitions will be all at a different hour and may not even > happen on the same day. Well, the _writer_ of zoneinfo rules gets to use local times, so it's no problem for them. The UTC transition times in the binary tzfiles are indeed an irregular mess. But having an exhaustive list of transitions makes many tasks easy to code. zoneinfo seems determined to make UTC->local quick & reliable regardless of data-space burden. > I think the future of TZ support in Python is to come up with some > automatic way to translate from raw Olson files to > utcoffset()/dst()/tzname() implementations and invent some clever > fromutc() algorithm that will correctly "invert" y = x - x.utcoffset() in all cases. Which I'm afraid is backwards, since the overwhelmingly most important source of timezone data makes UTC->local easy, at least until 2038. That's why ".utcoffset()" is in the Subject line: zoneinfo hands us .fromutc() on a silver platter. It's .utcoffset() that's the puzzle now. After zoneinfo is wrapped in the post-495 world, it's possible nobody will ever write a tzinfo again :-) > For the latter task, I have a solution in my prototype branch, but it > requires up to six calls to utcoffset() which may indeed be the best > one can do and it is not a coincidence that the number of calls in the > worst case is the same as the number of pokes in mktime. In which case it's about equally messy either way, yes? I haven't stared at mktime() in anger. Is there anything it could be _told_ about the local zone that could ease its job?
For example, being told in advance the largest possible difference between adjacent UTC offsets? The smallest granularity of differences between adjacent UTC offsets? A list of all possible deltas between adjacent UTC offsets? Anything along those lines. tzfiles don't answer those questions directly, but it's easy to compute things like that while loading the file. From tim.peters at gmail.com Sun Aug 30 08:33:14 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 30 Aug 2015 01:33:14 -0500 Subject: [Datetime-SIG] Trivial vs easy: .utcoffset() In-Reply-To: References: Message-ID: [Alex] >> The reason I think Python API is superior is because with exception >> of leap seconds, all transitions in Olson database are given in local >> time in the raw files. [Tim] > Well, they don't have to be in the plain text data files, that's just > the default. And overwhelmingly most common. There are also ways to > say "but this time is in the zone's 'standard time' regardless of > is_dst", and "but this time is in UTC". ... Here's a list of all the string values appearing in Rule-line "AT" columns, each followed by the number of times it appears. The ones ending with "s" mean "standard time, not wall-clock time", and the ones ending with "u" mean "UTC, not wall-clock time". All others are the default wall-clock time (which can be spelled with an explicit "w" suffix, but never is):

    0        1
    0:00     755
    0:00s    74
    0:01     12
    1:00     61
    1:00s    40
    1:00u    14
    2:00     389
    2:00s    277
    2:30     7
    2:45s    9
    3:00     51
    3:00s    3
    3:00u    16
    3:30     26
    4:00     4
    4:00u    12
    5:00     2
    9:00     1
    11:00    1
    12:00    2
    22:00    1
    22:00s   10
    23:00    41
    23:00s   105
    23:00u   7
    23:30    1
    23:30s   1
    24:00    20

I was surprised to see how often "s" gets used. For example, quite often in France rules before 1940 (in the "europe" file). You can even ask me what the difference is between "0:00 tomorrow" and "24:00 today" ;-) There really is a reason for that. Some crazy scheme switching at the end of the last Thursday of some month.
zoneinfo has a way to spell, e.g., "the last Thursday of March", but not a way to spell "the day after the last Thursday of March", and "the day after the last Thursday of March" isn't always "the last Friday of March". So you spell the switch as "24:00 on the last Thursday of March". Why on Earth would anyone switch DST during the work week? That one is tougher. If the rules are changed every year, people get annoyed. But if you never change the rules, then eventually DST will switch during some important stretch of the Islamic calendar (which is lunar, not solar, and has years about 11 days shorter than solar years). So the scheme above was apparently calculated to ensure Egypt wouldn't switch DST near the start of Ramadan for at least a decade. But then they'll have to change the rules. But not really, because they already changed them again ;-) Note that it can't work to define DST in regular rules related to the Islamic calendar instead, because "the reason" for DST is to adjust to physical realities related to the solar calendar. Then again, looks like the Egyptians are catching on that there's no real "reason" to mess with DST at all: http://www.timeanddate.com/news/time/egypt-cancels-dst-2015.html What I remain unclear about is why anyone would imagine any of this is Python's problem ;-) From tim.peters at gmail.com Sun Aug 30 22:50:11 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 30 Aug 2015 15:50:11 -0500 Subject: [Datetime-SIG] Trivial vs easy: .utcoffset() In-Reply-To: References: Message-ID: [Tim] > I haven't stared at mktime() in anger. Is there anything it could be > _told_ about the local zone that could ease its job? ... Easy idea: after loading a tzfile, create and store (on the tzinfo instance) a list of every unique total UTC offset in the zone's history, ordered by most recent to least.
.utcoffset() then only needs to march through that list once, seeing whether the input minus the current offset (from the list) converts back to the input via .fromutc(). If none do, that's an internal error. That should usually get out in one or two tries (the zone's most recent "standard" and "daylight" total offsets, in some order, will usually be tried first). For historical dates, who cares - they're only going to appear in test cases anyway, and even then can't require more .fromutc() calls than there are unique UTC offsets in the zone's history. For zones entirely defined by a POSIX TZ rule, the list would contain at most two entries. Nailed it ;-) Pickles are always a puzzle. The list of unique offsets may well change in the future, so it should really be recomputed from scratch when unpickling. From tim.peters at gmail.com Mon Aug 31 00:19:40 2015 From: tim.peters at gmail.com (Tim Peters) Date: Sun, 30 Aug 2015 17:19:40 -0500 Subject: [Datetime-SIG] Trivial vs easy: .utcoffset() In-Reply-To: References: Message-ID: [Tim] > Easy idea: after loading a tzfile, create and store (on the tzinfo > instance) a list of every unique total UTC offset in the zone's > history, ordered by most recent to least. .utcoffset() then only > needs to march through that list once, seeing whether the input minus > the current offset (from the list) converts back to the input via > .fromutc(). If none do, that's an internal error. Not quite. "Converts back" has to include reproducing the `fold` value too. If a user gives a datetime with fold=1 where it doesn't make sense, .fromutc() can't reproduce it. Likewise if a user gives a datetime in a gap. With enough extra annoying code, useful error messages for those could be produced. I wouldn't want to _guess_ the intent in either of those cases. It's an internal error if neither of those applies.
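The "march through the list of unique offsets" idea, including the correction that the round trip must also reproduce `fold`, might look roughly like the sketch below. Everything named here is an assumption made for illustration: the class TransitionZone, its constructor arguments, and the hand-written two-transition table standing in for a loaded tzfile are all hypothetical. Raising ValueError for gap times and for fold=1 outside a fold is just one possible choice, not something the messages above settle.

```python
from datetime import datetime, timedelta, tzinfo


class TransitionZone(tzinfo):
    """Hypothetical stand-in for a tzfile wrapper, driven by UTC transitions."""

    def __init__(self, initial_offset, transitions):
        # transitions: sorted list of (naive UTC datetime, new total UTC offset)
        self._initial = initial_offset
        self._transitions = transitions
        # Every unique total offset in the zone's history, most recent first.
        self._offsets = []
        for off in [o for _, o in reversed(transitions)] + [initial_offset]:
            if off not in self._offsets:
                self._offsets.append(off)

    def fromutc(self, dt):
        # The easy direction: walk the transition list with a UTC time.
        utc = dt.replace(tzinfo=None)
        off, fold = self._initial, 0
        for when, new_off in self._transitions:
            if utc < when:
                break
            # After a backward shift, wall times repeat for (off - new_off).
            fold = int(utc - when < off - new_off)
            off = new_off
        return (utc + off).replace(tzinfo=self, fold=fold)

    def utcoffset(self, dt):
        # The hard direction: try each known offset, keep the one whose
        # UTC guess converts back to the same wall time and the same fold.
        wall = dt.replace(tzinfo=None)
        for off in self._offsets:
            back = self.fromutc((wall - off).replace(tzinfo=self))
            if back.replace(tzinfo=None) == wall and back.fold == dt.fold:
                return off
        raise ValueError("time is in a gap, or fold=1 outside a fold")


# Illustrative data only: US Eastern rules for 2015, written out by hand.
H = timedelta(hours=1)
eastern = TransitionZone(-5 * H, [(datetime(2015, 3, 8, 7), -4 * H),
                                  (datetime(2015, 11, 1, 6), -5 * H)])
summer = datetime(2015, 7, 1, 12, tzinfo=eastern).utcoffset()
```

With this toy table the offset list holds two entries, so no lookup needs more than two .fromutc() calls. A real wrapper would build the list while loading the tzfile and, per the note about pickles, rebuild it when unpickling.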
From 4kir4.1i at gmail.com Mon Aug 31 11:56:03 2015 From: 4kir4.1i at gmail.com (Akira Li) Date: Mon, 31 Aug 2015 12:56:03 +0300 Subject: [Datetime-SIG] A local timezone class In-Reply-To: (Guido van Rossum's message of "Fri, 28 Aug 2015 15:53:40 -0700") References: Message-ID: <87io7v6ato.fsf@gmail.com> Guido van Rossum writes: > I've written a simple class that implements a local timezone tzinfo object, > deferring to what the time module exposes about the local zone. > > https://gist.github.com/gvanrossum/ef201fe313719305c4c7 > > There are two variations: one for systems that support tm_gmtoff and > tm_zone, one for systems without those. > > Output of the test program: > > BetterLocalTimeZone > Fri Aug 28 15:50:22 2015 PDT (-0700) > Wed Feb 24 15:50:22 2016 PST (-0800) > > I'm not entirely sure why we didn't add this to the stdlib ages ago. (Maybe > out of a sense of perfectionism, since time.localtime() may be wrong for > dates in the past or future where different DST rules or a different > standard offset apply? But why would we care, if we're fine with the time > module's behavior?) If tm_gmtoff, tm_zone attributes are not available then time module does not provide a way to get correct tzname [1]. In particular, datetime.astimezone() may fail to return a correct tzname due to that [2] [1] http://bugs.python.org/issue22798 [2] https://mail.python.org/pipermail/datetime-sig/2015-August/000471.html From 4kir4.1i at gmail.com Mon Aug 31 12:42:16 2015 From: 4kir4.1i at gmail.com (Akira Li) Date: Mon, 31 Aug 2015 13:42:16 +0300 Subject: [Datetime-SIG] PEP-431/495 In-Reply-To: (Tim Peters's message of "Thu, 27 Aug 2015 22:18:20 -0500") References: <87zj1d7usy.fsf@gmail.com> <87fv356k5i.fsf@gmail.com> Message-ID: <87h9nf68on.fsf@gmail.com> Tim Peters writes: > [Akira Li <4kir4.1i at gmail.com>] > ... 
>> then arithmetic "as though via conversion to utc time" is *equally valid* as >> the arithmetic "as though it is a timezone-naive datetime object". > > __add__ can only mean one of them. And it already does. It _could_ > have meant the other, but it doesn't. > In fact, it does mean the other: https://mail.python.org/pipermail/datetime-sig/2015-August/000545.html stdlib is consistent with pytz (utc arithmetic). dateutil (naive) produces different results. ... >> Timezones is a very complicated >> topic -- no solution works in the general case. > > ? Timezone transitions are mathematically trivial. They're just full > of lumps (irregularities). > Irregularity is another word for complexity. Timezone rules are defined by politicians. Even if they appear to be simpletons, it is still hard to create a math equation that will predict the rules. Instead, the *database* approach with regular updates works well in practice. From guido at python.org Mon Aug 31 16:55:08 2015 From: guido at python.org (Guido van Rossum) Date: Mon, 31 Aug 2015 07:55:08 -0700 Subject: [Datetime-SIG] A local timezone class In-Reply-To: <87io7v6ato.fsf@gmail.com> References: <87io7v6ato.fsf@gmail.com> Message-ID: On Mon, Aug 31, 2015 at 2:56 AM, Akira Li <4kir4.1i at gmail.com> wrote: > Guido van Rossum writes: > > > I've written a simple class that implements a local timezone tzinfo > object, > > deferring to what the time module exposes about the local zone. > > > > https://gist.github.com/gvanrossum/ef201fe313719305c4c7 > > > > There are two variations: one for systems that support tm_gmtoff and > > tm_zone, one for systems without those. > > > > Output of the test program: > > > > BetterLocalTimeZone > > Fri Aug 28 15:50:22 2015 PDT (-0700) > > Wed Feb 24 15:50:22 2016 PST (-0800) > > > > I'm not entirely sure why we didn't add this to the stdlib ages ago.
> (Maybe > > out of a sense of perfectionism, since time.localtime() may be wrong for > > dates in the past or future where different DST rules or a different > > standard offset apply? But why would we care, if we're fine with the time > > module's behavior?) > > If tm_gmtoff, tm_zone attributes are not available then time module does > not provide a way to get correct tzname [1]. In particular, > datetime.astimezone() may fail to return a correct tzname due to that [2] > > [1] http://bugs.python.org/issue22798 That bug report is incomprehensible. You have received feedback about this already. > > [2] > https://mail.python.org/pipermail/datetime-sig/2015-August/000471.html > So is this message. -- --Guido van Rossum (python.org/~guido) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Mon Aug 31 19:58:12 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 31 Aug 2015 12:58:12 -0500 Subject: [Datetime-SIG] Another round on error-checking Message-ID: I've been playing with what it would take to wrap zoneinfo efficiently in a post-495 world. When I got to .utcoffset(), I just cringed when trying to implement the "in the face of ambiguity and/or impossibility, make stuff up ;-) ", parts. The pytz folks have been enthusiastic about pytz's approach. Alas, it's a poor fit to datetime's design, because pytz strives to make it appear that "naive time" doesn't exist at all for datetimes with a tzinfo. But in the design, they do. Regardless of whether a tzinfo is present, a datetime is intended to be viewed as working in naive time. "Missing" and "ambiguous" times plain don't exist in naive time, so it's unnatural to check for them all over the place. It's when a timezone-*specific* operation is attempted that the user is explicitly moving out of naive time (not merely when a tzinfo is attached). So, in my view, *that's* when to check. 
.utcoffset() is the primary such place (whether called directly or indirectly). At that point, two kinds of "meaningless" times pop into existence: 1. fold != 0 when the datetime isn't actually in a fold. 2. The datetime is in a gap. There is no UTC time that maps back to such cases, so there is no possible timedelta .utcoffset() can return that's wholly justifiable. PEP 495 specifies resolving such cases by magic, in essentially arbitrary (from the user's point of view) ways. This isn't for backward compatibility, because 495-compliant tzinfos don't currently exist(*). It's more that 495 gives users no other way to determine whether a datetime _is_ "a problem case" other than by calling .utcoffset() twice with different values for `fold`, and then making .utcoffset() return carefully chosen (but arbitrary from the user's POV) problem-case results sufficient to classify the datetime from the two .utcoffset() results. I think I'd rather acknowledge that problem cases exist in a direct and straightforward way, by adding a new tzinfo (say).classify() method. For example, .classify() could return a (kind, detail) 2-tuple. - kind==DTKIND_NORMAL. Not an exceptional case. detail is None. - kind==DTKIND_FOLD_NORMAL. The datetime is in a fold, and its `fold` value is sane. detail is the datetime's `fold` value (0 or 1). - kind==DTKIND_FOLD_INVALID. The datetime does not have `fold==0`, but the datetime is not in a fold. detail is the datetime's `fold` value (whatever it may be). - kind==DTKIND_GAP. The datetime is in a gap. detail is a (d1, d2) 2-tuple, where `d1` and `d2` are timedeltas such that (in classic arithmetic): datetime - d1 is the closest earlier non-gap time datetime + d2 is the closest later non-gap time Users can call that directly when they like. .utcoffset() (and other appropriate timezone-specific methods) would raise exceptions in the DTKIND_FOLD_INVALID and DTKIND_GAP cases, with the same exception detail as `classify()` returns. 
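The (kind, detail) contract described above could be sketched as follows. This is only an illustration under loud assumptions: no classify() method exists on datetime.tzinfo, the ToyZone class and its constructor are hypothetical, the zone knows exactly one gap and one fold given directly as naive wall-time spans, and ValueError stands in for whatever exception type such a design would pick.

```python
from datetime import datetime, timedelta, tzinfo

DTKIND_NORMAL, DTKIND_FOLD_NORMAL, DTKIND_FOLD_INVALID, DTKIND_GAP = range(4)
RESOLUTION = timedelta(microseconds=1)  # smallest datetime step


class ToyZone(tzinfo):
    """Hypothetical zone with one gap followed by one fold (naive wall times)."""

    def __init__(self, std, dst, gap, fold):
        self.std, self.dst = std, dst  # total UTC offsets
        self.gap_span = gap            # [first missing time, first valid time after)
        self.fold_span = fold          # [first repeated time, first unrepeated after)

    def classify(self, dt):
        wall = dt.replace(tzinfo=None)
        gap_start, gap_end = self.gap_span
        if gap_start <= wall < gap_end:
            # wall - d1 is the closest earlier non-gap time,
            # wall + d2 is the closest later non-gap time.
            return DTKIND_GAP, (wall - gap_start + RESOLUTION, gap_end - wall)
        fold_start, fold_end = self.fold_span
        if fold_start <= wall < fold_end:
            return DTKIND_FOLD_NORMAL, dt.fold
        if dt.fold != 0:
            return DTKIND_FOLD_INVALID, dt.fold
        return DTKIND_NORMAL, None

    def utcoffset(self, dt):
        kind, detail = self.classify(dt)
        if kind in (DTKIND_FOLD_INVALID, DTKIND_GAP):
            # Same detail as classify() returns, carried by the exception.
            raise ValueError(kind, detail)
        if kind == DTKIND_FOLD_NORMAL:
            return self.std if detail else self.dst
        wall = dt.replace(tzinfo=None)
        in_dst = self.gap_span[1] <= wall < self.fold_span[0]
        return self.dst if in_dst else self.std


# Illustrative data only: US Eastern wall-clock gap and fold for 2015.
H = timedelta(hours=1)
zone = ToyZone(-5 * H, -4 * H,
               gap=(datetime(2015, 3, 8, 2), datetime(2015, 3, 8, 3)),
               fold=(datetime(2015, 11, 1, 1), datetime(2015, 11, 1, 2)))
kind, detail = zone.classify(datetime(2015, 3, 8, 2, 30, tzinfo=zone))
```

Note that classify() never touches UTC, so code living in naive time can call it freely; only .utcoffset() turns the two problem kinds into exceptions.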
This would, of course, require major rewriting of the PEP. So Alex will hate it ;-) But, leaving aside how much design pain it would cause, is it "the right" (or "a righter") thing to do? That's what I'm more concerned about. In any case, since this is _a_ view of error checking that hasn't been mentioned at all before, it's worth putting it out in public. BTW, I don't expect pytz to like it. In Python's datetime design, timeline arithmetic should be done in UTC (or via timestamps) instead. The scheme above intends to catch errors _when_ converting to UTC, leaving naive time alone until (if ever) the user does explicitly invoke a timezone operation. (*) WRT backward compatibility, there are other non-obvious cases after 495 tzinfos do exist. Like datetime.__hash__() calling .utcoffset(). It would be desirable that people living in naive time (despite attaching tzinfos) not need to worry about exceptions in cases like that when using a 495 tzinfo. In the kind of scheme above, one way around that is changing __hash__ (which could resolve problem cases in any way that works best for _its_ purposes). Another way is adding optional `check` Boolean arguments to various methods, defaulting to False, in which case the current 495 "make stuff up" results would be returned. But I'm trying to take a higher-level view of "what's right" in _this_ msg ;-) From alexander.belopolsky at gmail.com Mon Aug 31 20:26:15 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 31 Aug 2015 14:26:15 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: On Mon, Aug 31, 2015 at 1:58 PM, Tim Peters wrote: > PEP 495 specifies resolving such cases by magic, in essentially > arbitrary (from the user's point of view) ways. This isn't for > backward compatibility, because 495-compliant tzinfos don't currently > exist > ..
but tzinfo.fromutc() [1] does exist and the first thing happening in that code is an unguarded call to utcoffset() on a UTC datetime with a transplanted tzinfo. [2] A trick like this have been valid in datetime since its introduction and has been used in numerous places since. You know that with a little effort universal .fromutc() can be rewritten to be PEP 495 compliant. If you think allowing datetime.utcoffset() to raise an exception is an option, let's first see how one can rewrite universal tzinfo.fromutc() for that design. Even though I don't think reusing a universal .fromutc() is a good option for tzinfo implementers, it gives an example of simple but non-trivial datetime manipulation code which PEP 495 is designed not to break. [1]: https://hg.python.org/cpython/file/v3.5.0rc2/Lib/datetime.py#l957 [2]: https://hg.python.org/cpython/file/v3.5.0rc2/Lib/datetime.py#l965 -------------- next part -------------- An HTML attachment was scrubbed... URL: From carl at oddbird.net Mon Aug 31 20:30:00 2015 From: carl at oddbird.net (Carl Meyer) Date: Mon, 31 Aug 2015 12:30:00 -0600 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: <55E49D28.8030204@oddbird.net> Hi Tim, On 08/31/2015 11:58 AM, Tim Peters wrote: > I've been playing with what it would take to wrap zoneinfo efficiently > in a post-495 world. When I got to .utcoffset(), I just cringed when > trying to implement the "in the face of ambiguity and/or > impossibility, make stuff up ;-) ", parts. > > The pytz folks have been enthusiastic about pytz's approach. Alas, > it's a poor fit to datetime's design, because pytz strives to make it > appear that "naive time" doesn't exist at all for datetimes with a > tzinfo. But in the design, they do. Regardless of whether a tzinfo > is present, a datetime is intended to be viewed as working in naive > time. "Missing" and "ambiguous" times plain don't exist in naive > time, so it's unnatural to check for them all over the place. 
> > It's when a timezone-*specific* operation is attempted that the user > is explicitly moving out of naive time (not merely when a tzinfo is > attached). So, in my view, *that's* when to check. .utcoffset() is > the primary such place (whether called directly or indirectly). That's pretty much what I proposed in the first invalid-time-checking thread. Alex didn't like it because `utcoffset()` is called from so many different places: https://mail.python.org/pipermail/datetime-sig/2015-August/000499.html AFAICT, you are re-proposing the same solution you characterized several times earlier as "spraying errors all over the place" and "going nowhere fast." :-) Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From alexander.belopolsky at gmail.com Mon Aug 31 20:33:59 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 31 Aug 2015 14:33:59 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: On Mon, Aug 31, 2015 at 1:58 PM, Tim Peters wrote: > I think I'd rather acknowledge that problem cases exist in a direct > and straightforward way, by adding a new tzinfo (say).classify() > method. For example, .classify() could return a > > (kind, detail) > > 2-tuple. > > - kind==DTKIND_NORMAL. > Not an exceptional case. > detail is None. > > - kind==DTKIND_FOLD_NORMAL. > The datetime is in a fold, and its `fold` value is sane. > detail is the datetime's `fold` value (0 or 1). > > - kind==DTKIND_FOLD_INVALID. > The datetime does not have `fold==0`, but the datetime is not in a fold. > detail is the datetime's `fold` value (whatever it may be). > > - kind==DTKIND_GAP. > The datetime is in a gap. 
> detail is a (d1, d2) 2-tuple, where `d1` and `d2` are > timedeltas such that (in classic arithmetic): > datetime - d1 is the closest earlier non-gap time > datetime + d2 is the closest later non-gap time > > Users can call that directly when they like. > I have no objection to this method as long as a default implementation in terms of a PEP 495 compliant .utcoffset() is provided in the basic datetime.tzinfo. Not a bad idea for another PEP. :-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.peters at gmail.com Mon Aug 31 20:34:10 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 31 Aug 2015 13:34:10 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: <55E49D28.8030204@oddbird.net> References: <55E49D28.8030204@oddbird.net> Message-ID: [Carl Meyer ] > That's pretty much what I proposed in the first invalid-time-checking > thread. Alex didn't like it because `utcoffset()` is called from so many > different places: > https://mail.python.org/pipermail/datetime-sig/2015-August/000499.html That is a potential problem. > AFAICT, you are re-proposing the same solution you characterized several > times earlier as "spraying errors all over the place" and "going nowhere > fast." :-) Nope. There's nothing here about, e.g., messing with datetime constructors, .replace(), .combine() ... "naive time" is left alone here. It's only timezone-specific operations targeted here, which are all implemented _by_ tzinfo objects. Not by datetime itself. From carl at oddbird.net Mon Aug 31 20:38:31 2015 From: carl at oddbird.net (Carl Meyer) Date: Mon, 31 Aug 2015 12:38:31 -0600 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E49D28.8030204@oddbird.net> Message-ID: <55E49F27.9000906@oddbird.net> On 08/31/2015 12:34 PM, Tim Peters wrote: > [Carl Meyer ] >> That's pretty much what I proposed in the first invalid-time-checking >> thread. 
Alex didn't like it because `utcoffset()` is called from so many >> different places: >> https://mail.python.org/pipermail/datetime-sig/2015-August/000499.html > > That is a potential problem. > >> AFAICT, you are re-proposing the same solution you characterized several >> times earlier as "spraying errors all over the place" and "going nowhere >> fast." :-) > > Nope. There's nothing here about, e.g., messing with datetime > constructors, .replace(), .combine() ... "naive time" is left alone > here. It's only timezone-specific operations targeted here, which are > all implemented _by_ tzinfo objects. Not by datetime itself. There wasn't any of that stuff (messing with constructors, or replace, or combine, or naive time) in what Alex and I were discussing in the other thread, either. Just the idea of having `utcoffset()` raise an error if it hit an ambiguity. Carl -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From carl at oddbird.net Mon Aug 31 20:43:24 2015 From: carl at oddbird.net (Carl Meyer) Date: Mon, 31 Aug 2015 12:43:24 -0600 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: <55E4A04C.1050900@oddbird.net> On 08/31/2015 11:58 AM, Tim Peters wrote: > I think I'd rather acknowledge that problem cases exist in a direct > and straightforward way, by adding a new tzinfo (say).classify() > method. For example, .classify() could return a > > (kind, detail) > > 2-tuple. FWIW, regardless of the question of `utcoffset()` raising exceptions, the addition of this `classify()` method alone would resolve every use case I've ever had for pytz's `is_dst=None` strict ambiguity handling. Carl -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From tim.peters at gmail.com Mon Aug 31 20:53:59 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 31 Aug 2015 13:53:59 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: [Tim] >> PEP 495 specifies resolving such cases by magic, in essentially >> arbitrary (from the user's point of view) ways. This isn't for >> backward compatibility, because 495-compliant tzinfos don't currently >> exist [Alex] > .. but tzinfo.fromutc() [1] does exist I know - I wrote it ;-) > and the first thing happening in that code is an unguarded call > to utcoffset() on a UTC datetime with a transplanted tzinfo. [2] A > trick like this has been valid in datetime since its introduction > and has been used in numerous places since. But "continuing to never raise an exception" does not imply "continuing to work as intended". All such code needs to be carefully audited in a PEP-495 world to ensure that the problem-case return values work as intended with the old code and the new PEP 495 behavior _if_ a 495-compliant tzinfo object is used. > You know that with a little effort universal .fromutc() can be rewritten to be > PEP 495 compliant. > > If you think allowing datetime.utcoffset() to raise an exception is an > option, let's first see how one can rewrite universal tzinfo.fromutc() for > that design. Even though I don't think reusing a universal .fromutc() is a > good option for tzinfo implementers, it gives an example of simple but > non-trivial datetime manipulation code which PEP 495 is designed not to > break. Eh. _If_ .utcoffset() could raise exceptions, then the obvious way would be to catch the exceptions and, in each exceptional case, implement whichever behavior is most appropriate in context. That could be used to implement any desired behavior whatsoever.
> [1]: https://hg.python.org/cpython/file/v3.5.0rc2/Lib/datetime.py#l957 > [2]: https://hg.python.org/cpython/file/v3.5.0rc2/Lib/datetime.py#l965 BTW, it just occurred to me that PEP 495 has already broken datetime.__hash__. That is, the fundamental invariant a hash implementation must satisfy is: A == B implies hash(A) == hash(B) But, under 495, equality of datetimes ignores `fold`, but .utcoffset() does not, and def __hash__(self): tzoff = self.utcoffset() if tzoff is None: ... else: ... self._hashcode = hash(timedelta(days, seconds, self.microsecond) - tzoff) So two datetimes in a fold, one with fold=0 and the other with fold=1, will compare equal but almost certainly have different hashes. That's a real puzzle. But the kind of error-checking I'm discussing doesn't make that _worse_ ;-) WRT the current __hash__ problem, perhaps it's enough for the PEP to note that ... in some number of cases including hash(), `fold` does make a difference ;-) From tim.peters at gmail.com Mon Aug 31 20:57:18 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 31 Aug 2015 13:57:18 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: <55E49F27.9000906@oddbird.net> References: <55E49D28.8030204@oddbird.net> <55E49F27.9000906@oddbird.net> Message-ID: [Carl Meyer ] > ... > There wasn't any of that stuff (messing with constructors, or replace, > or combine, or naive time) in what Alex and I were discussing in the > other thread, either. Just the idea of having `utcoffset()` raise an > error if it hit an ambiguity. Ah, my apologies! You're absolutely right. I didn't go back to review Alex's reply in context, and it's a general truth that - in any mailing list - people only remember a few things about what they themselves have written ;-) Thanks for setting the record straight!
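Returning to the hash invariant raised earlier in this exchange (A == B implies hash(A) == hash(B)): the property can be checked directly against any fold-aware tzinfo. The sketch below assumes a Python where datetime has the `fold` attribute (3.6 or later); the FoldZone class and its single hard-coded fold are hypothetical, made up only to have something to test with. As far as I know, the datetime module that eventually shipped computes the hash from the fold=0 offset, which is why the invariant holds there even though the two offsets differ.

```python
from datetime import datetime, timedelta, tzinfo

STD, DST = timedelta(hours=-5), timedelta(hours=-4)
FOLD_START, FOLD_END = datetime(2015, 11, 1, 1), datetime(2015, 11, 1, 2)


class FoldZone(tzinfo):
    """Hypothetical zone: DST before a single fold, standard time after it."""

    def utcoffset(self, dt):
        wall = dt.replace(tzinfo=None)
        if FOLD_START <= wall < FOLD_END:
            # Inside the fold, `fold` picks the earlier (DST) or later (STD) reading.
            return STD if dt.fold else DST
        return DST if wall < FOLD_START else STD


zone = FoldZone()
earlier = datetime(2015, 11, 1, 1, 30, tzinfo=zone)
later = earlier.replace(fold=1)

same_value = earlier == later                  # same tzinfo: fold is ignored
different_offsets = earlier.utcoffset() != later.utcoffset()
same_hash = hash(earlier) == hash(later)       # what the invariant demands
```

The check is cheap enough to drop into a test suite for any tzinfo implementation that claims to honor `fold`.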
From alexander.belopolsky at gmail.com Mon Aug 31 20:58:14 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 31 Aug 2015 14:58:14 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: <55E49F27.9000906@oddbird.net> References: <55E49D28.8030204@oddbird.net> <55E49F27.9000906@oddbird.net> Message-ID: On Mon, Aug 31, 2015 at 2:38 PM, Carl Meyer wrote: [Tim Peters] > > Nope. There's nothing here about, e.g., messing with datetime > > constructors, .replace(), .combine() ... "naive time" is left alone > > here. It's only timezone-specific operations targeted here, which are > > all implemented _by_ tzinfo objects. Not by datetime itself. > > There wasn't any of that stuff (messing with constructors, or replace, > or combine, or naive time) in what Alex and I were discussing in the > other thread, either. Just the idea of having `utcoffset()` raise an > error if it hit an ambiguity. I think the main difference between Tim's current proposal and what was previously discussed is that all older proposals somehow required a third value for fold. Note that there is a third variant suggested by Guido off-list and discussed in the PEP: have fold=-1 by default, ignore it unless it is nonnegative and design whatever you want for fold=0/1 without concerns for backward compatibility. This effectively will give two different datetime classes: classic and new. Both are perfectly consistent, but if you think interoperation between naive and aware is confusing, try to explain how new naive instances will interoperate with classic aware! -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From carl at oddbird.net Mon Aug 31 21:01:19 2015 From: carl at oddbird.net (Carl Meyer) Date: Mon, 31 Aug 2015 13:01:19 -0600 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: <55E49D28.8030204@oddbird.net> <55E49F27.9000906@oddbird.net> Message-ID: <55E4A47F.3010509@oddbird.net> On 08/31/2015 12:58 PM, Alexander Belopolsky wrote: > On Mon, Aug 31, 2015 at 2:38 PM, Carl Meyer > wrote: > [Tim Peters] > > > Nope. There's nothing here about, e.g., messing with datetime > > constructors, .replace(), .combine() ... "naive time" is left alone > > here. It's only timezone-specific operations targeted here, which are > > all implemented _by_ tzinfo objects. Not by datetime itself. > > There wasn't any of that stuff (messing with constructors, or replace, > or combine, or naive time) in what Alex and I were discussing in the > other thread, either. Just the idea of having `utcoffset()` raise an > error if it hit an ambiguity. > > > I think the main difference between Tim's current proposal and what was > previously discussed is that all older proposals somehow required a > third value for fold. Yes, that's true. That's because (unless I'm misunderstanding) Tim is suggesting something far more audacious than I had considered: making "raise an error on ambiguity" the default behavior, instead of an opt-in choice. The extra value for `fold` was just the opt-in mechanism. Carl -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 836 bytes Desc: OpenPGP digital signature URL: From tim.peters at gmail.com Mon Aug 31 21:07:32 2015 From: tim.peters at gmail.com (Tim Peters) Date: Mon, 31 Aug 2015 14:07:32 -0500 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: <55E4A47F.3010509@oddbird.net> References: <55E49D28.8030204@oddbird.net> <55E49F27.9000906@oddbird.net> <55E4A47F.3010509@oddbird.net> Message-ID: [Alex] >> I think the main difference between Tim's current proposal and what was >> previously discussed is that all older proposals somehow required a >> third value for fold. [Carl] > Yes, that's true. That's because (unless I'm misunderstanding) Tim is > suggesting something far more audacious than I had considered: making > "raise an error on ambiguity" the default behavior, instead of an opt-in > choice. The extra value for `fold` was just the opt-in mechanism. At a high level, I'm questioning the "_never_ raise an exception" PEP 495 behavior. It grates. "Errors should never pass silently" and such. Which of "always raise an exception" or "opt-in" would be better is secondary to me at this point. There's little point to arguing about the low-order bits before there's consensus on the high-order bit. From alexander.belopolsky at gmail.com Mon Aug 31 21:11:01 2015 From: alexander.belopolsky at gmail.com (Alexander Belopolsky) Date: Mon, 31 Aug 2015 15:11:01 -0400 Subject: [Datetime-SIG] Another round on error-checking In-Reply-To: References: Message-ID: On Mon, Aug 31, 2015 at 2:53 PM, Tim Peters wrote: > BTW, it just occurred to me that PEP 495 has already broken > datetime.__hash__. ... > The PEP did not and in the reference implementation, I am careful to reset fold/first before computing the hash: https://github.com/abalkin/cpython/blob/issue24773/Lib/datetime.py#L1178 -------------- next part -------------- An HTML attachment was scrubbed... 

From tim.peters at gmail.com Mon Aug 31 21:20:18 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 31 Aug 2015 14:20:18 -0500
Subject: [Datetime-SIG] Another round on error-checking
In-Reply-To:
References:
Message-ID:

[Tim]
>> BTW, it just occurred to me that PEP 495 has already broken
>> datetime.__hash__. ...

[Alex]
> The PEP did not and in the reference implementation, I am careful to reset
> fold/first before computing the hash:
>
> https://github.com/abalkin/cpython/blob/issue24773/Lib/datetime.py#L1178

But you're pointing to time.__hash__ there. I'm talking about
datetime.__hash__. You replace `first` there too, but _only_ if
.utcoffset() returns None:

    def __hash__(self):
        if self._hashcode == -1:
            tzoff = self.utcoffset()
            if tzoff is None:
                self._hashcode = hash(self.replace(first=True)._getstate()[0])
            else:
                days = _ymd2ord(self.year, self.month, self.day)
                seconds = self.hour * 3600 + self.minute * 60 + self.second
                self._hashcode = hash(timedelta(days, seconds, self.microsecond) - tzoff)
        return self._hashcode

So it's the case that two datetimes that compare true may have
different hashes, when they represent the earlier and later times in a
fold. I didn't say "it's a puzzle" lightly ;-)

From alexander.belopolsky at gmail.com Mon Aug 31 21:23:59 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 31 Aug 2015 15:23:59 -0400
Subject: [Datetime-SIG] Another round on error-checking
In-Reply-To:
References: <55E49D28.8030204@oddbird.net> <55E49F27.9000906@oddbird.net> <55E4A47F.3010509@oddbird.net>
Message-ID:

On Mon, Aug 31, 2015 at 3:07 PM, Tim Peters wrote:

> At a high level, I'm questioning the "_never_ raise an exception" PEP
> 495 behavior. It grates. "Errors should never pass silently" and
> such.
>

But these are not errors! As I mentioned before, it was bad PR on my
part to call datetimes in the gaps or with ignorable fold=1 "invalid."
I should have called them "denormalized."
I believe many aware datetime manipulation algorithms can benefit from
having denormalized instances as intermediate values and being able to
call .utcoffset() and friends on such instances.

My primary use case is the "naive scheduler" which gives you no means
to schedule anything with fold=1 and if you give it 02:45 AM in the gap
it will silently take it for 03:45 AM. As long as it displays the
correct time in every reminder, I don't care that it did not chastise
me for not knowing about the DST gap.

From alexander.belopolsky at gmail.com Mon Aug 31 21:33:21 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 31 Aug 2015 15:33:21 -0400
Subject: [Datetime-SIG] Another round on error-checking
In-Reply-To:
References:
Message-ID:

On Mon, Aug 31, 2015 at 3:20 PM, Tim Peters wrote:

>     def __hash__(self):
>         if self._hashcode == -1:
>             tzoff = self.utcoffset()
>             if tzoff is None:
>                 self._hashcode = hash(self.replace(first=True)._getstate()[0])
>             else:
>                 days = _ymd2ord(self.year, self.month, self.day)
>                 seconds = self.hour * 3600 + self.minute * 60 + self.second
>                 self._hashcode = hash(timedelta(days, seconds, self.microsecond) - tzoff)
>         return self._hashcode
>
> So it's the case that two datetimes that compare true may have
> different hashes, when they represent the earlier and later times in a
> fold. I didn't say "it's a puzzle" lightly ;-)
>

Yes, it looks like I have a bug there, but isn't fixing it just a
matter of moving self.replace(first=True) up two lines? Is there a
bigger puzzle? Certainly x == y ⇒ hash(x) == hash(y) is the implication
that I intend to preserve in all cases.

From tim.peters at gmail.com Mon Aug 31 21:50:24 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 31 Aug 2015 14:50:24 -0500
Subject: [Datetime-SIG] Another round on error-checking
In-Reply-To:
References: <55E49D28.8030204@oddbird.net> <55E49F27.9000906@oddbird.net> <55E4A47F.3010509@oddbird.net>
Message-ID:

[Tim]
>> At a high level, I'm questioning the "_never_ raise an exception" PEP
>> 495 behavior. It grates. "Errors should never pass silently" and
>> such.

[Alex]
> But these are not errors!

That depends on the application. As W Kahan quipped when devising the
IEEE-754 exception model, they're called "exceptions" because
_whatever_ you do someone will take exception to it ;-)

> As I mentioned before, it was bad PR on my part
> to call datetimes in the gaps or with ignorable fold=1 "invalid." I should
> have called them "denormalized."

It doesn't matter what they're called _except_ for "PR purposes". I'm
not trying to do PR here.

> I believe many aware datetime manipulation algorithms can benefit from
> having denormalized instances as intermediate values and being able
> to call .utcoffset() and friends on such instances.

That's why, e.g., C's mktime allows "insane" values for the day, etc.
Python's internal C datetime implementation does too. It's also why
Paul Eggert patched glibc's mktime to "pick one" for a gap case. But
that's all C, working at a level so low everything is painful.
datetime isn't intended to be infinitely painful ;-)

> My primary use case is the "naive scheduler" which gives you no means to
> schedule anything with fold=1 and if you give it 02:45 AM in the gap it will
> silently take it for 03:45 AM. As long as it displays the correct time in
> every reminder, I don't care that it did not chastise me for not knowing
> about the DST gap.

I don't dispute that, in a gap case, picking the time that "would have
displayed" had the user moved the clock forward is most useful most often.
If that's believed to be overwhelmingly the case, that at least argues
for opt-in exceptions. But passing fold=1 in a non-fold/gap case is
almost certainly a flat-out programmer error, _except_ that PEP 495
requires never ever complaining about it, and to the contrary
_requires_ it to be done if the user wants to find out whether their
datetime is in any way odd.

C'mon, admit it - _part_ of it makes you cringe too, if even just a
little ;-)

From alexander.belopolsky at gmail.com Mon Aug 31 21:54:15 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 31 Aug 2015 15:54:15 -0400
Subject: [Datetime-SIG] Another round on error-checking
In-Reply-To:
References:
Message-ID:

On Mon, Aug 31, 2015 at 3:33 PM, Alexander Belopolsky
<alexander.belopolsky at gmail.com> wrote:

> On Mon, Aug 31, 2015 at 3:20 PM, Tim Peters wrote:
>
>>     def __hash__(self):
>>         if self._hashcode == -1:
>>             tzoff = self.utcoffset()
>>             if tzoff is None:
>>                 self._hashcode = hash(self.replace(first=True)._getstate()[0])
>>             else:
>>                 days = _ymd2ord(self.year, self.month, self.day)
>>                 seconds = self.hour * 3600 + self.minute * 60 + self.second
>>                 self._hashcode = hash(timedelta(days, seconds, self.microsecond) - tzoff)
>>         return self._hashcode
>>
>> So it's the case that two datetimes that compare true may have
>> different hashes, when they represent the earlier and later times in a
>> fold. I didn't say "it's a puzzle" lightly ;-)
>>
>
> Yes, it looks like I have a bug there, but isn't fixing it just a matter
> of moving self.replace(first=True) up two lines? Is there a bigger
> puzzle? Certainly x == y ⇒ hash(x) == hash(y) is the implication that I
> intend to preserve in all cases.

I think I admitted defeat too soon. Can you present a specific case
where "two datetimes that compare true have different hashes"?
There may be some subtlety due to the fact that we ignore tzinfo in ==
if it is the same for both sides, but when we compute hash(), we don't
know what's on the other side. It is hard to tell without a specific
example. I thought I got it right when I wrote the code above, but it
is possible I missed some case.

From tim.peters at gmail.com Mon Aug 31 22:03:19 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 31 Aug 2015 15:03:19 -0500
Subject: [Datetime-SIG] Another round on error-checking
In-Reply-To:
References:
Message-ID:

>>>     def __hash__(self):
>>>         if self._hashcode == -1:
>>>             tzoff = self.utcoffset()
>>>             if tzoff is None:
>>>                 self._hashcode = hash(self.replace(first=True)._getstate()[0])
>>>             else:
>>>                 days = _ymd2ord(self.year, self.month, self.day)
>>>                 seconds = self.hour * 3600 + self.minute * 60 + self.second
>>>                 self._hashcode = hash(timedelta(days, seconds, self.microsecond) - tzoff)
>>>         return self._hashcode

> ...
> I think I admitted defeat too soon. Can you present a specific case where
> "two datetimes that compare true have different hashes"?

Two aware datetimes in a single zone representing the earlier and later
ambiguous time. All fields (including tzinfo) are identical except for
fold. "==" says True. But `tzoff` differs between them, so the code
above passes different values to `hash()`. It's not guaranteed that the
hashes differ, but it's very likely they differ.

> There may be some subtlety due to the fact that we ignore tzinfo in ==
> if it is the same for both sides,

That's why they compare equal in this case.

> but when we compute hash(), we don't know what's on the other
> side. It is hard to tell without a specific example. I thought I got it
> right when I wrote the code above, but it is possible I missed some case.

It's a puzzle ;-)

From alexander.belopolsky at gmail.com Mon Aug 31 22:13:10 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 31 Aug 2015 16:13:10 -0400
Subject: [Datetime-SIG] Another round on error-checking
In-Reply-To:
References: <55E49D28.8030204@oddbird.net> <55E49F27.9000906@oddbird.net> <55E4A47F.3010509@oddbird.net>
Message-ID:

On Mon, Aug 31, 2015 at 3:50 PM, Tim Peters wrote:

> But passing fold=1 in a
> non-fold/gap case is almost certainly a flat-out programmer error,
> _except_ that PEP 495 requires never ever complaining about it, and to
> the contrary _requires_ it to be done if the user wants to find out
> whether their datetime is in any way odd.
>

A programmer who knows about fold does not need as much hand-holding as
the one who does not. My customer is the ignorant programmer.

> C'mon, admit it - _part_ of it makes you cringe too, if even just a little
> ;-)
>

What makes me cringe is the problem domain where I am required to
implement an inverse to a non-monotonic function. And that's on top of
the world where clock hands can move backwards.

From tim.peters at gmail.com Mon Aug 31 22:17:27 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 31 Aug 2015 15:17:27 -0500
Subject: [Datetime-SIG] Another round on error-checking
In-Reply-To:
References:
Message-ID:

>>     def __hash__(self):
>>         if self._hashcode == -1:
>>             tzoff = self.utcoffset()
>>             if tzoff is None:
>>                 self._hashcode = hash(self.replace(first=True)._getstate()[0])
>>             else:
>>                 days = _ymd2ord(self.year, self.month, self.day)
>>                 seconds = self.hour * 3600 + self.minute * 60 + self.second
>>                 self._hashcode = hash(timedelta(days, seconds, self.microsecond) - tzoff)
>>         return self._hashcode
>>
>> So it's the case that two datetimes that compare true may have
>> different hashes, when they represent the earlier and later times in a
>> fold.
>> I didn't say "it's a puzzle" lightly ;-)

[Alex]
> Yes, it looks like I have a bug there, but isn't fixing it just a matter of
> moving self.replace(first=True) up two lines? Is there a bigger puzzle?
> Certainly x == y ⇒ hash(x) == hash(y) is the implication that I intend to
> preserve in all cases.

Yes, there's a bigger puzzle: datetimes expressed in different
timezones can also compare equal. Conceptually, they're converted to
UTC before comparison - and so also, to maintain the crucial hash
invariant, before being hashed. That can't work right without using
their actual UTC offsets (i.e., `first` can't be ignored for interzone
equality, but would be ignored for hashes if forcing `first` to 1 were
done before extracting the offset).

The real problem here is that this stuff just barely managed to work
from the start ;-) In effect, for the purpose of hashing, _all_
datetimes are converted to UTC first now. That didn't interfere with
the "naive time" view before because all possible insanities were
blithely ignored in all contexts before.

The easiest way out of this particular puzzle is, I believe, to say
that two datetimes identical except for `fold` do _not_ compare equal.
`fold` breaks the tie in the obvious way (the one with fold==1 is
"greater"). Then __hash__ can continue using the real UTC offset, just
as before.

If a user doesn't force `fold` to 1, then no existing code will change
behavior, at least until they start using 495 tzinfos. Then `fold=1`
can start appearing "by magic" via .fromutc().

From alexander.belopolsky at gmail.com Mon Aug 31 22:24:42 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 31 Aug 2015 16:24:42 -0400
Subject: [Datetime-SIG] Another round on error-checking
In-Reply-To:
References:
Message-ID:

On Mon, Aug 31, 2015 at 4:17 PM, Tim Peters wrote:

> The easiest way out of this particular puzzle is, I believe, to say
> that two datetimes identical except for `fold` do _not_ compare equal.
> `fold` breaks the tie in the obvious way (the one with fold==1 is
> "greater").
>

I am afraid you are right, but proving that we will not break naive
(fold unaware) programs will be harder in this case. Let me think some
more about this.

Meanwhile, would you see any problem with not(x - y) not implying
x == y?

From tim.peters at gmail.com Mon Aug 31 22:38:53 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 31 Aug 2015 15:38:53 -0500
Subject: [Datetime-SIG] Another round on error-checking
In-Reply-To:
References:
Message-ID:

[Tim]
>> The easiest way out of this particular puzzle is, I believe, to say
>> that two datetimes identical except for `fold` do _not_ compare equal.
>> `fold` breaks the tie in the obvious way (the one with fold==1 is
>> "greater").

[Alex]
> I am afraid you are right, but proving that we will not break naive (fold
> unaware) programs will be harder in this case. Let me think some more
> about this.
>
> Meanwhile, would you see any problem with not(x - y) not implying x == y?

Which is another puzzle :-( It's very intentional now that

    dt1 == dt2 if and only if dt1 - dt2 == timedelta(0)

Here's a related puzzle, if comparison used `fold` to break ties:

    y = x + timedelta(0)

If x had fold=1, y will have fold=0, and then x != y.

In all, maybe it's better to leave __hash__ slightly broken.

From alexander.belopolsky at gmail.com Mon Aug 31 22:46:58 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 31 Aug 2015 16:46:58 -0400
Subject: [Datetime-SIG] Another round on error-checking
In-Reply-To:
References:
Message-ID:

On Mon, Aug 31, 2015 at 4:38 PM, Tim Peters wrote:

> [Tim]
> >> The easiest way out of this particular puzzle is, I believe, to say
> >> that two datetimes identical except for `fold` do _not_ compare equal.
> >> `fold` breaks the tie in the obvious way (the one with fold==1 is
> >> "greater").
>
> [Alex]
> > I am afraid you are right, but proving that we will not break naive (fold
> > unaware) programs will be harder in this case. Let me think some more
> > about this.
> >
> > Meanwhile, would you see any problem with not(x - y) not implying x == y?
>
> Which is another puzzle :-( It's very intentional now that
>
>     dt1 == dt2 if and only if dt1 - dt2 == timedelta(0)
>
> Here's a related puzzle, if comparison used `fold` to break ties:
>
>     y = x + timedelta(0)
>
> If x had fold=1, y will have fold=0, and then x != y.
>
> In all, maybe it's better to leave __hash__ slightly broken.
>

After some thought, I believe the way to fix the implementation is what
I suggested at first: reset fold to 0 before calling utcoffset() in
__hash__. A rare hash collision is a small price to pay for having
datetimes with different timezones in the same dictionary.

Now, please, can we not start discussing how __hash__ should behave if
utcoffset() raises a MissingTimeError?

From tim.peters at gmail.com Mon Aug 31 23:15:00 2015
From: tim.peters at gmail.com (Tim Peters)
Date: Mon, 31 Aug 2015 16:15:00 -0500
Subject: [Datetime-SIG] Another round on error-checking
In-Reply-To:
References:
Message-ID:

[Alex]
> After some thought, I believe the way to fix the implementation is what I
> suggested at first: reset fold to 0 before calling utcoffset() in __hash__.
> A rare hash collision is a small price to pay for having datetimes with
> different timezones in the same dictionary.

Ya, I can live with that. In effect, we give up on converting to UTC
correctly for purposes of computing hash(), but only in rare cases.
hash() doesn't really care, and it remains true that datetime equality
(which does care) still implies hash equality. The later and earlier of
ambiguous times will simply land on the same hash chain.
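
A minimal sketch of that trade-off, written against the `fold` attribute
as it later shipped in Python 3.6 rather than the `first` flag of the
reference implementation quoted in this thread. `ToyEastern` is a
hypothetical tzinfo invented purely for illustration (one fold, no gap);
it is not taken from the PEP or from any library:

```python
from datetime import datetime, timedelta, timezone, tzinfo


class ToyEastern(tzinfo):
    """Hypothetical zone: UTC-4 until 2015-11-01 02:00 wall time, then UTC-5.

    The wall-clock hour 01:00-02:00 on 2015-11-01 therefore occurs twice.
    """

    FOLD_START = datetime(2015, 11, 1, 1)
    FOLD_END = datetime(2015, 11, 1, 2)

    def utcoffset(self, dt):
        wall = dt.replace(tzinfo=None)
        if wall < self.FOLD_START:
            return timedelta(hours=-4)
        if wall < self.FOLD_END:
            # Ambiguous hour: fold selects the earlier (0) or later (1) reading.
            return timedelta(hours=-5 if dt.fold else -4)
        return timedelta(hours=-5)

    def dst(self, dt):
        return self.utcoffset(dt) + timedelta(hours=5)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"


tz = ToyEastern()
earlier = datetime(2015, 11, 1, 1, 30, tzinfo=tz)  # fold=0
later = earlier.replace(fold=1)                    # same wall time, fold=1

# The two readings are one hour apart once converted to UTC ...
gap = later.astimezone(timezone.utc) - earlier.astimezone(timezone.utc)
assert gap == timedelta(hours=1)

# ... yet same-zone comparison and subtraction ignore fold,
assert earlier == later
assert later - earlier == timedelta(0)

# so the hashes have to agree too, or dict and set lookups would break.
assert hash(earlier) == hash(later)

# Arithmetic always produces fold=0, so x + timedelta(0) == x still holds.
assert (later + timedelta(0)).fold == 0
```

With these definitions `{earlier: "a", later: "b"}` ends up holding a
single key, even though the two readings are distinct instants in UTC.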

> Now, please, can we not start discussing how __hash__ should behave if
> utcoffset() raises a MissingTimeError?

__hash__ is a tail. I still want to get people thinking about the dog.
Really. Let's give people some space to look at it from a higher level?
Implementation details aren't users' problems, and I want to be (more)
sure we can live with the high-level model.

From alexander.belopolsky at gmail.com Mon Aug 31 23:29:49 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 31 Aug 2015 17:29:49 -0400
Subject: [Datetime-SIG] Another round on error-checking
In-Reply-To:
References:
Message-ID:

On Mon, Aug 31, 2015 at 2:53 PM, Tim Peters wrote:

> But "continuing to never raise an exception" does not imply
> "continuing to work as intended". All such code needs to be carefully
> audited in a PEP-495 world to ensure that the problem-case return
> values work as intended with the old code and the new PEP 495 behavior
> _if_ a 495-compliant tzinfo object is used.
>

My design goals with respect to aware datetime objects were

(1) objects with pre-PEP tzinfo should work exactly the same as they
    did before;
(2) objects with post-PEP tzinfo may produce different results when
    used with pre-PEP code, but these differences should be limited to
    the problem cases (gaps and folds) where new behavior can be
    defended as a bug fix.

I find it unacceptable for a program that switched to post-PEP tzinfo
to crash several years after that because it was not sufficiently well
tested on problem times.

In most applications, confusing the first and the second 01:30 AM is a
forgivable error. Applications that cannot tolerate it should not use
local times or should be carefully audited for PEP 495 compliance.
However, even applications that can tolerate a random choice between
one 01:30 AM and another usually cannot tolerate a server crash at
02:45 AM.

From alexander.belopolsky at gmail.com Mon Aug 31 23:42:21 2015
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 31 Aug 2015 17:42:21 -0400
Subject: [Datetime-SIG] Another round on error-checking
In-Reply-To:
References:
Message-ID:

On Mon, Aug 31, 2015 at 5:15 PM, Tim Peters wrote:

> > Now, please, can we not start discussing how __hash__ should behave if
> > utcoffset() raises a MissingTimeError?
>
> __hash__ is a tail. I still want to get people thinking about the
> dog. Really. Let's give people some space to look at it from a
> higher level? Implementation details aren't users' problems, and I
> want to be (more) sure we can live with the high-level model.

This forum may not be inclusive enough for this. People in this group
know too much! I plan to cross some t's and dot some i's in the PEP
text this week and complete the "big rename" in the reference
implementation. After that, I will be ready to present the PEP for a
final round on Python-Dev. Granted, Python-Dev is also a rather elite
group, but I hope we will find a few people there who have used
time.mktime(dt.timetuple()) in their programs and never thought there
was anything wrong with their code.
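
For readers who recognise that idiom, a small sketch of why it cannot
tell the two 01:30 AMs apart. It assumes Python 3.6 or later (for the
`fold` attribute as eventually released); `naive_to_timestamp` is a
hypothetical helper name used only here, and the numeric timestamp it
returns depends on the system time zone, so no value is shown:

```python
import time
from datetime import datetime


def naive_to_timestamp(dt):
    """The common idiom: treat a naive datetime as system-local wall time."""
    return time.mktime(dt.timetuple())


first_0130 = datetime(2015, 11, 1, 1, 30)   # fold=0
second_0130 = first_0130.replace(fold=1)    # same wall time, later reading

# A naive datetime has no tzinfo to ask about DST, so timetuple() reports
# tm_isdst == -1 ("unknown") and mktime() is left to guess.
assert first_0130.timetuple().tm_isdst == -1
assert second_0130.timetuple().tm_isdst == -1

# Both readings yield identical struct_time values, so the idiom maps
# them to one timestamp whatever the system zone happens to be.
assert first_0130.timetuple() == second_0130.timetuple()
assert naive_to_timestamp(first_0130) == naive_to_timestamp(second_0130)
```

By contrast, `datetime.timestamp()` in Python 3.6 and later consults
`fold` for naive datetimes, so it can separate the two readings when the
system zone really has a fold at that wall time.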