[Numpy-discussion] Making datetime64 timezone naive

Chris Barker chris.barker at noaa.gov
Mon Oct 19 15:34:56 EDT 2015


On Sun, Oct 18, 2015 at 12:20 PM, Alexander Belopolsky <ndarray at mac.com>
wrote:

>
> On Sat, Oct 17, 2015 at 6:59 PM, Chris Barker <chris.barker at noaa.gov>
> wrote:
>
>> Off the top of my head, I think allowing a 60th second makes more sense
>> -- just like we do leap years.
>
>
> Yet we don't implement DST by allowing the 24th hour.  Even the countries
> that adjust the clocks at midnight don't do that.
>

Well, isn't that about conforming to already existing standards? DST is a
civil construct -- and most (all?) implementations use the convention of
having repeated times -- so that's what software has to deal with.

IIUC, at least *some* standards handle leap seconds by adding a second
numbered 60 (the 61st), rather than having a repeated one. So it's at least
an option to do it that way. And it can then fit into the already existing
standards for representing datetimes, etc.

Does the "fold" flag approach for representing, well, "folds" exist in any
widely used standard? It's my impression that it doesn't, since we had to
argue a lot about what to call it :-)


> In some sense leap seconds are more similar to timezone changes (DST or
> political) because they are irregular and unpredictable.
>

In that regard, yes -- you need a constantly updated database to use them.
But I don't know that that has any impact on how you represent them. They
seem a lot more like leap years to me -- some Februaries have a 29th day --
some minutes on some days have a 61st second.
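
A minimal sketch of the leap-year analogy, using only the standard library.
The function name is_valid_datetime and the LEAP_SECOND_DAYS table are
hypothetical, made up for illustration; the table is a partial, hand-copied
list, and a real implementation would need a regularly updated source.

```python
import calendar

# Hypothetical, partial table of UTC days that ended with a leap second.
LEAP_SECOND_DAYS = {(2008, 12, 31), (2012, 6, 30), (2015, 6, 30)}


def is_valid_datetime(year, month, day, hour, minute, second):
    """Return True if the fields name a representable UTC date and time."""
    if not 1 <= month <= 12:
        return False
    # Leap years: the calendar rule decides whether February has a 29th day.
    days_in_month = calendar.monthrange(year, month)[1]
    if not 1 <= day <= days_in_month:
        return False
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        return False
    if 0 <= second <= 59:
        return True
    # Leap seconds: a table lookup decides whether 23:59 has a second 60.
    return (
        second == 60
        and (hour, minute) == (23, 59)
        and (year, month, day) in LEAP_SECOND_DAYS
    )


print(is_valid_datetime(2016, 2, 29, 0, 0, 0))
print(is_valid_datetime(2015, 6, 30, 23, 59, 60))
```

The only structural difference between the two checks is that the leap-year
rule is computable while the leap-second rule needs data.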


> Furthermore, the notion of "fold" is not tied to a particular 24/60/60
> system of encoding times and thus more applicable to numpy where
> times are encoded as binary integers.
>

But there are no folds in the underlying integer representation -- that is
the "continuous" time scale. The folds (or leap seconds, or leap years, or
any of the 24/60/60 business) come in only when you want to go to or from
the "datetime" representation.
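
A minimal sketch of that point, standard library only and not numpy API.
The two fixed offsets are assumed stand-ins for US Eastern daylight and
standard time around the 2015-11-01 fall-back; a real timezone database
would pick the offset automatically.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets for the two sides of the fall-back transition.
EDT = timezone(timedelta(hours=-4))
EST = timezone(timedelta(hours=-5))

# Two distinct points on the continuous integer scale, one hour apart.
t1 = int(datetime(2015, 11, 1, 5, 30, tzinfo=timezone.utc).timestamp())
t2 = t1 + 3600

# Convert each to a wall-clock reading with the offset in force at that time.
wall1 = datetime.fromtimestamp(t1, tz=EDT).replace(tzinfo=None)
wall2 = datetime.fromtimestamp(t2, tz=EST).replace(tzinfo=None)

print(t2 - t1)         # the integers differ by one hour
print(wall1 == wall2)  # the wall-clock readings coincide: that is the fold
```

The integers stay distinct and ordered; the ambiguity shows up only in the
wall-clock values produced by the conversion.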

>> If anyone decides to actually get around to leap seconds support in numpy
>> datetime, s/he can decide ...
>
>
> This attitude is the reason why we will probably never have bug free
> software when it comes to civil time reckoning.

OK -- fair enough -- good to think about it sooner than later.


> Similarly, current numpy.datetime64 design ties arithmetic with encoding.
> This makes arithmetic easier, but in the long run may preclude designs that
> better match the problem domain.


I don't follow here -- how can you NOT tie arithmetic to encoding? Sure,
you could decide that you are going to overload the arithmetic, and it's up
to the object that encodes the data to do that math -- but that's pretty
much what datetime64 is doing -- defining an encoding so that it can do
math. numpy dtypes are very much about binary representation. No reason
one couldn't make a different numpy dtype for datetimes that encoded it a
different way, and then it would have to implement math, too.
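
A rough sketch of what "a different encoding has to implement its own math"
could look like. Both classes are hypothetical and use only the standard
library; neither is a numpy dtype or reflects how datetime64 is implemented
internally beyond the idea of storing an integer count.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class EpochSeconds:
    """Hypothetical encoding: one integer counting seconds since 1970-01-01."""

    count: int

    def __sub__(self, other):
        # The encoding is the arithmetic: plain integer subtraction.
        return self.count - other.count


@dataclass(frozen=True)
class CalendarFields:
    """Hypothetical encoding: broken-down naive calendar fields."""

    year: int
    month: int
    day: int
    hour: int
    minute: int
    second: int

    def _as_datetime(self):
        return datetime(self.year, self.month, self.day,
                        self.hour, self.minute, self.second)

    def __sub__(self, other):
        # The fields cannot be subtracted directly; the type has to apply
        # calendar rules (borrowed here from datetime) to get seconds.
        return int((self._as_datetime() - other._as_datetime()).total_seconds())


print(EpochSeconds(86400) - EpochSeconds(0))
print(CalendarFields(2016, 3, 1, 0, 0, 0) - CalendarFields(2016, 2, 28, 0, 0, 0))
```

Either way the arithmetic is defined by whoever defines the encoding; the
integer-count version just gets it for free.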



> Note how the development of PEP 495 has highlighted the fact that allowing
> binary operations (subtraction, comparison etc.) between times in different
> timezones was a design mistake.  It will be wise to learn from such
> mistakes when redesigning numpy.datetime64.

So was not considering folds -- frankly, and I think this may be your point,
I don't think timezones were well thought out at all when datetime
was first introduced -- and however well thought out it was, if you don't
provide an implementation, you are not going to find the limitations. And
despite Tim's articulate defense of the original implementation decisions,
I think encoding the datetime in the local "calendar/clock" just invites a
mess. And I'm quite convinced that it wouldn't be the way to go for numpy
use-cases.

> If you ever plan to support civil time in some form, you should think about
> it now.

Well, the goal for now is naive time -- and unlike the original datetime --
we are not adding on a "you can implement your own timezone handling this
way" hook yet.

> In Python 3.6, datetime.now() will return different values in the first
> and the second repeated hour in the "fall-back fold." If you allow
> datetime.datetime to numpy.datetime64 conversion, you should decide what
> you do with that difference.

Indeed. Though will that only occur with timezones that have DST? I know
I'd be fine with NOT being able to create a numpy datetime64 from a
non-naive datetime object.  Which would force the user to think about and
convert to the timezone they want before passing off to numpy.

Unless you can suggest a sensible default way to handle this. At first
blush, I think naive time does not have folds, so there is no way to handle
them "properly".
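
A minimal sketch of the "refuse non-naive datetimes" option, standard
library only. The helper name naive_to_int_us is hypothetical, and the
microsecond count since 1970-01-01 is an assumed stand-in for whatever a
naive datetime64 would store; this is not numpy's actual conversion code.

```python
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1)  # naive epoch, no timezone attached


def naive_to_int_us(dt):
    """Encode a naive datetime as integer microseconds since the epoch.

    Aware datetimes are rejected so the caller must choose a timezone
    (and so resolve any fold) before the value is handed over.
    """
    if dt.tzinfo is not None and dt.utcoffset() is not None:
        raise TypeError("aware datetime given; convert to a naive one first")
    return (dt - _EPOCH) // timedelta(microseconds=1)


# The caller decides explicitly what "naive" should mean, here UTC.
aware = datetime(2015, 10, 19, 15, 34, 56,
                 tzinfo=timezone(timedelta(hours=-4)))
as_utc_naive = aware.astimezone(timezone.utc).replace(tzinfo=None)
print(naive_to_int_us(as_utc_naive))
```

Passing `aware` directly raises TypeError, which is the point: the fold
question never reaches the naive type.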

Also -- I think we are at phase one of a (at least) two step process:

1) clean up datetime64 just enough that it is useful, and less error-prone
-- i.e. have it not pretend to support anything other than naive datetimes.

2) Do it right -- perhaps adding some time zone support. This is going to
wait until the numpy dtype machinery is cleaned up some.

Phase 2 is where we really need the thinking ahead. And I'm still confused
about what thinking ahead needs to be done for potential leap second
support.

-CHB

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov