[Numpy-discussion] fixing up datetime

Christopher Barker Chris.Barker at noaa.gov
Mon Jun 6 14:33:42 EDT 2011


Mark Wiebe wrote:
> I'm wondering if removing the business-day unit from datetime64, and 
> adding a business-day API would be a good approach to support all the 
> things that are needed?

That sounds like a good idea to me -- and perhaps it could be a general 
Calendar Functions API, to handle other issues as well.

Looking again at the NEP, I see "business day" as a time unit, then 
below, that "B" is interpreted as "24h, 1440m, 86400s" i.e. a day.

This seems like a really bad idea -- it implies that you can convert 
from business day to hours, and back again, but, of course, if you do:


start_date + n business_days

You should not get the same thing as:

start_date + (n * 24) hrs


which is why I think that a "business day" is not really a unit -- it is 
a specification for a Calendar operation.
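
A minimal sketch of that difference, using only the Python standard 
library. The helper add_business_days is hypothetical (it is not part of 
numpy or the NEP), and it assumes a Monday-Friday week with no holidays:

```python
from datetime import date, datetime, timedelta

def add_business_days(start, n):
    """Advance `start` by n business days.

    Assumption for illustration: Monday-Friday week, no holidays.
    """
    current = start
    remaining = n
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0..4 are Monday..Friday
            remaining -= 1
    return current

start = date(2011, 6, 3)  # a Friday

# Calendar operation: one business day after Friday is Monday.
by_calendar = add_business_days(start, 1)

# Linear arithmetic: 24 hours after Friday 00:00 is Saturday.
start_midnight = datetime.combine(start, datetime.min.time())
by_hours = (start_midnight + timedelta(hours=24)).date()

print(by_calendar, by_hours)
```

The two results differ because the first one depends on a set of rules 
(which days count), while the second is plain linear arithmetic.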

My thought is that a datetime should not be able to be expressed in 
units like business days. A timedelta should, but then there needs to be 
a Calendar(i.e. a set of rules) associated with it, and it should be 
clearly distinct from "linear unit" timedeltas.


Business days aside, I also see that months and years are given 
"Interpreted as" definitions:

Code 	Interpreted as
Y 	12M, 52W, 365D
M 	4W, 30D, 720h

This is even self-inconsistent:

1Y == 365D

1Y == 12M == 12 * 30D == 360D

1Y == 12M == 12 * 4W == 12 * 4 * 7D == 336D

1Y == 52W == 52 * 7D == 364D

Is it not clear from this what a mess of mis-interpretation might result 
from all that?
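
The same arithmetic as a short Python check. The factors come straight 
from the "Interpreted as" table; the dictionary and its labels are just 
for illustration:

```python
# Conversion factors taken from the NEP's "Interpreted as" table.
DAYS_PER_WEEK = 7

days_in_one_year = {
    "1Y -> 365D": 365,
    "1Y -> 12M -> 30D each": 12 * 30,
    "1Y -> 12M -> 4W each": 12 * 4 * DAYS_PER_WEEK,
    "1Y -> 52W": 52 * DAYS_PER_WEEK,
}

for path, days in days_in_one_year.items():
    print(f"{path}: {days} days")

# If the units were truly convertible, this set would hold one value.
distinct_answers = sorted(set(days_in_one_year.values()))
print(distinct_answers)
```

Four conversion paths give four different lengths for the same year.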


Thinking about it more, numpy's needs are a little different from those of 
the netcdf standards -- netcdf is a way to transmit and store information, 
and one of the key problems that comes up is that people develop tools to 
do things with the data that make assumptions. For instance, my tools that 
work with CF-compliant netcdf pretty much always convert the "time" axis 
(expressed as time_units since a_datetime) into python datetime objects, 
and then go from there.

This goes to heck if the data is expressed in something like "months 
since 1995-01-01"

Because months are only defined on a Calendar.
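
To illustrate, here is a rough sketch of decoding "months since 
1995-01-01" two ways. The helper months_since is hypothetical, and it 
assumes the Gregorian calendar and an epoch day that exists in every 
month (day 1 is always safe):

```python
from datetime import date, timedelta

def months_since(epoch, n):
    """Return the date n calendar months after `epoch`.

    Assumption for illustration: Gregorian calendar, and epoch.day
    is valid in the target month.
    """
    total = (epoch.month - 1) + n
    return date(epoch.year + total // 12, total % 12 + 1, epoch.day)

epoch = date(1995, 1, 1)

# Calendar-aware decoding: month lengths differ.
first_gap = (months_since(epoch, 1) - epoch).days
second_gap = (months_since(epoch, 2) - months_since(epoch, 1)).days

# Treating a month as a fixed 30-day unit drifts away from the calendar.
calendar_result = months_since(epoch, 12)
linear_result = epoch + timedelta(days=12 * 30)

print(first_gap, second_gap)
print(calendar_result, linear_result)
```

A tool that silently assumes 30-day months ends up several days off 
after a single year, and the error grows from there.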

Anyway, this kind of thing may be less of an issue because we use numpy 
to write the tools, not to store and share the data, so hopefully the 
tool author knows what she/he is working with. But I still think that 
linear, convertible time units and time units that vary depending on 
where you are on the calendar, and on which calendar you are using, 
should be kept clearly distinct.


-Chris




-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


