[Numpy-discussion] datetimes with date vs time units, local time, and time zones

Mark Wiebe mwwiebe at gmail.com
Wed Jun 15 16:40:51 EDT 2011


Towards a reasonable behavior with regard to local times, I've made the
default repr for datetimes use the C standard library to print them in a
local ISO format. Combined with the ISO8601-prescribed behavior of
interpreting datetime strings with no timezone specifier as local time,
this allows the following cases to behave reasonably:

>>> np.datetime64('now')
numpy.datetime64('2011-06-15T15:16:51-0500','s')

>>> np.datetime64('2011-06-15T18:00')
numpy.datetime64('2011-06-15T18:00-0500','m')

As noted in another thread, there can be some extremely surprising behavior
as a consequence:

>>> np.array(['now', '2011-06-15'], dtype='M')
array(['2011-06-15T15:18:26-0500', '2011-06-14T19:00:00-0500'],
      dtype='datetime64[s]')

Having the 15th of June print out as 7pm on the 14th of June is probably not
what one would generally expect, so I've come up with an approach which
hopefully deals with this in a good way.
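
To see where the 7pm comes from, here is a minimal sketch using only the
Python standard library rather than NumPy itself. It assumes the date string
is stored as midnight UTC and then printed with a fixed UTC-5 offset, matching
the -0500 in the output above; the variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a date with no time of day is stored as midnight UTC.
stored = datetime(2011, 6, 15, tzinfo=timezone.utc)

# Fixed UTC-5 offset, matching the -0500 shown in the output above.
local_tz = timezone(timedelta(hours=-5))

# Rendering the stored instant in local time lands on the previous evening.
shown = stored.astimezone(local_tz)
print(shown.strftime('%Y-%m-%dT%H:%M:%S%z'))
```

The stored instant is unchanged; only the local rendering crosses the date
boundary.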

One firm principle of the datetime in NumPy is that it is always stored as a
POSIX time (referencing UTC), or a TAI time. There are two categories of
units that can be used, which I will call *date units* and *time units*. The
date units are 'Y', 'M', 'W', and 'D', while the time units are 'h', 'm',
's', ..., 'as'. Time zones are only applied to datetimes stored in time
units, so there's a qualitative difference between date and time units with
respect to string conversions and calendar operations.

I would like to place an 'unsafe' casting barrier between the date units and
the time units, so that the above conversion from a date into a datetime
will raise an error instead of producing a confusing result. This only
applies to datetimes and not timedeltas. For timedeltas the day <-> hour
case is fine; only the year/month <-> other units case has issues, and that
is already treated with an 'unsafe' casting barrier.
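
The proposed barrier can be sketched in plain Python as follows. This is an
illustration of the rule described above, not NumPy's implementation: the
helper names (is_date_unit, crosses_barrier, check_datetime_cast) are
hypothetical, and any unit that is not one of the four date units is simply
treated as a time unit.

```python
# Date units as listed in the post; every other unit counts as a time unit.
# Unit codes are case-sensitive: 'M' is months, 'm' is minutes.
DATE_UNITS = frozenset({'Y', 'M', 'W', 'D'})


def is_date_unit(unit):
    """Return True for the calendar-date units 'Y', 'M', 'W', 'D'."""
    return unit in DATE_UNITS


def crosses_barrier(src_unit, dst_unit):
    """True when a datetime cast would move between date and time units."""
    return is_date_unit(src_unit) != is_date_unit(dst_unit)


def check_datetime_cast(src_unit, dst_unit, casting='same_kind'):
    """Hypothetical check: crossing the barrier requires casting='unsafe'."""
    if crosses_barrier(src_unit, dst_unit) and casting != 'unsafe':
        raise TypeError(
            "cannot cast datetime unit %r to %r under casting=%r"
            % (src_unit, dst_unit, casting))


check_datetime_cast('D', 'M')                    # date -> date: allowed
check_datetime_cast('h', 's')                    # time -> time: allowed
check_datetime_cast('D', 's', casting='unsafe')  # explicit opt-in: allowed
```

Under this rule, check_datetime_cast('D', 's') raises TypeError, which is the
error the date-to-datetime conversion above would produce instead of a
shifted timestamp.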

Two new functions will facilitate the conversions between datetimes with
date units and time units:

date_as_datetime(datearray, hour, minute, second, microsecond,
timezone='local', unit=None, out=None), which converts the provided dates
into datetimes at the specified time, according to the specified timezone.
If 'unit' is specified, it controls the output unit; otherwise the unit is
taken from 'out' or from the amount of precision specified in the call.

datetime_as_date(datetimearray, timezone='local', out=None), which converts
the provided datetimes into dates according to the specified timezone.

In both functions, timezone can be any of 'UTC', 'TAI', 'local', '+/-####',
or a datetime.tzinfo object. The latter will allow NumPy datetimes to work
with the pytz library for flexible time zone support.
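
The intended semantics of the two conversions can be sketched with the Python
standard library on scalar values. This is only an illustration under stated
assumptions: the function names below (parse_offset, date_to_posix_seconds,
posix_seconds_to_date) are hypothetical stand-ins, not the proposed NumPy
functions, and only 'UTC' and fixed '+/-####' offsets are handled. 'local',
'TAI', and tzinfo objects are left out.

```python
from datetime import date, datetime, timedelta, timezone


def parse_offset(tz):
    """Turn 'UTC' or a '+HHMM' / '-HHMM' string into a tzinfo object."""
    if tz == 'UTC':
        return timezone.utc
    if len(tz) != 5 or tz[0] not in '+-' or not tz[1:].isdigit():
        raise ValueError("unsupported timezone in this sketch: %r" % (tz,))
    sign = -1 if tz[0] == '-' else 1
    offset = timedelta(hours=int(tz[1:3]), minutes=int(tz[3:5]))
    return timezone(sign * offset)


def date_to_posix_seconds(d, hour=0, minute=0, second=0, tz='UTC'):
    """Attach a wall-clock time in zone tz to a date; return POSIX seconds."""
    local = datetime(d.year, d.month, d.day, hour, minute, second,
                     tzinfo=parse_offset(tz))
    return int(local.timestamp())


def posix_seconds_to_date(seconds, tz='UTC'):
    """Return the calendar date that a POSIX time falls on in zone tz."""
    return datetime.fromtimestamp(seconds, tz=parse_offset(tz)).date()


# Noon on June 15th in UTC-5 is 17:00 UTC on the same day.
noon = date_to_posix_seconds(date(2011, 6, 15), hour=12, tz='-0500')

# Midnight UTC on June 15th is still June 14th in UTC-5.
midnight_utc = date_to_posix_seconds(date(2011, 6, 15))
print(posix_seconds_to_date(midnight_utc, tz='UTC'))
print(posix_seconds_to_date(midnight_utc, tz='-0500'))
```

Making the timezone an explicit argument in both directions is what removes
the ambiguity: the caller states which zone's calendar the date belongs to.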

I would also like to extend the 'today' input string parsing to accept
strings like 'today 12:30' to allow a convenient way to express different
local times occurring today, mostly useful for interactive usage.
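
As a rough sketch of what such parsing could look like, the hypothetical
helper below (parse_today is not an existing or proposed NumPy function)
returns a date for a bare 'today' and a naive local datetime for
'today HH:MM'. The 'HH:MM' form is an assumption based on the 'today 12:30'
example; the optional 'today' argument exists only to make the result
reproducible.

```python
from datetime import date, datetime


def parse_today(text, today=None):
    """Parse 'today' into a date, or 'today HH:MM' into a local datetime."""
    if today is None:
        today = date.today()
    parts = text.split()
    if not parts or parts[0] != 'today' or len(parts) > 2:
        raise ValueError("expected 'today' or 'today HH:MM', got %r" % (text,))
    if len(parts) == 1:
        # No time of day given: stay in date units.
        return today
    # strptime rejects anything that is not a valid HH:MM clock time.
    clock = datetime.strptime(parts[1], '%H:%M').time()
    return datetime.combine(today, clock)


print(parse_today('today 12:30', today=date(2011, 6, 15)))
```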

I welcome any comments on this design, particularly if you can find a case
where this doesn't produce a reasonable behavior.

Cheers,
Mark