[Numpy-discussion] The date/time dtype and the casting issue

Matt Knox mattknox.ca at gmail.com
Wed Jul 30 22:45:36 EDT 2008


>> >> If it's really just weekdays why not call it that instead of using a
>> >> term like business days that (quite confusingly) suggests holidays
>> >> are handled properly?
>>
>> Well, we were adopting the name from the TimeSeries package.  Perhaps 
>> the authors can answer this better than me.

A lot of the inspiration for the original prototype of the timeseries module
came from FAME (http://www.sungard.com/Fame/). The proprietary FAME 4GL
language does a lot of things well when it comes to time series analysis, but
is (not surprisingly) very lacking as a general purpose programming language.
Python was the glue language I was using at work, and naturally I wanted to do
a lot of the stuff I could do in FAME using Python instead. Most of the
frequencies in the timeseries package are named the same as their FAME
counterparts. I'm not especially attached to the name "business" instead of
"weekday" for the frequency; it is just what I was used to from FAME, so I
went with it. I won't lose any sleep if you decide to call it "weekday" instead.

While on the topic of FAME... being a financial analyst, I really am quite
fond of the multitude of quarterly frequencies we have in the timeseries
package (with different year end points) because they are very useful when
doing things like "calendarizing" earnings from companies with different
fiscal year ends. These frequencies are included in FAME, which makes sense
since it targets financial users. I know Pierre likes them too for working
with different seasons. I think it would be OK to leave them out of an initial
implementation, but it is worth keeping in mind during the design phase how
the dtype could be extended to incorporate such things.
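
To make the "different year end points" idea concrete, here is a minimal
sketch using only the standard library. The function name fiscal_quarter and
the convention of labelling a fiscal year by the calendar year in which it
ends are assumptions made for illustration; they are not taken from FAME or
from the timeseries package.

```python
from datetime import date


def fiscal_quarter(d, year_end_month):
    """Return (fiscal_year, quarter) for date d.

    Assumption: a fiscal year is labelled by the calendar year in which
    it ends. Other conventions exist.
    """
    if not 1 <= year_end_month <= 12:
        raise ValueError("year_end_month must be in 1..12")
    # Number of whole months since the fiscal year started (0..11).
    offset = (d.month - year_end_month - 1) % 12
    quarter = offset // 3 + 1
    # Months after the year-end month belong to the next fiscal year.
    fiscal_year = d.year if d.month <= year_end_month else d.year + 1
    return fiscal_year, quarter


# The same calendar date lands in different quarters depending on the
# company's fiscal year end.
print(fiscal_quarter(date(2008, 2, 15), 12))  # (2008, 1): December year end
print(fiscal_quarter(date(2008, 2, 15), 9))   # (2008, 2): September year end
```

A dtype that supported such frequencies would need to carry the year-end month
as part of the unit, which is the kind of extension point meant above.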

>> As forbidding operations among absolute/absolute and relative/relative 
>> types can be unacceptable in many situations, we are proposing an 
>> explicit casting mechanism so that the user can inform about the 
>> desired time unit of the outcome.  For this, a new NumPy function, 
>> called, say, ``numpy.change_unit()`` (this name is for the purposes of 
>> the discussion and can be changed) will be provided.  The signature for 
>> the function will be:
>> 
>> change_unit(time_object, new_unit, reference)
>> 
>> where 'time_object' is the time object whose unit is to be 
>> changed, 'new_unit' is the desired new time unit, and 'reference' is an 
>> absolute date that will be used to allow the conversion of relative 
>> times in case of using time units with an uncertain number of smaller 
>> time units (relative years or months cannot be expressed in days).  For 
>> example, that would allow to do:
>> 
>> >>> numpy.change_unit( numpy.array([1,2], 'T[Y]'), 'T[d]' )
>> array([365, 731], dtype="datetime64[d]")

If I understand you correctly, this is very close to the "asfreq" method of
the Date/DateArray/TimeSeries classes in the timeseries module. One key
element missing here (from my point of view anyway) is an equivalent of the
'relation' parameter in the asfreq method in the timeseries module. This is
only used when converting from a lower frequency to a higher frequency (e.g.
annual to daily), where it determines whether the result falls at the start
or the end of the original period. For example...

>>> a = ts.Date(freq='Annual', year=2007)
>>> a.asfreq('Daily', 'START')
<D : 01-Jan-2007>
>>> a.asfreq('Daily', 'END')
<D : 31-Dec-2007>

This is another one of those things that I use all the time. Now whether it
belongs in the core dtype, or some extension module I'm not sure... but it's
an important feature in the timeseries module.
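
The behaviour of 'relation' in the example above can be sketched with the
standard library alone. This is only an illustration of the semantics, not the
timeseries implementation; the function name annual_to_daily is hypothetical.

```python
from datetime import date


def annual_to_daily(year, relation):
    """Map an annual period to a single day inside it.

    relation: 'START' picks the first day of the year,
              'END' picks the last day of the year.
    """
    relation = relation.upper()
    if relation == "START":
        return date(year, 1, 1)
    if relation == "END":
        return date(year, 12, 31)
    raise ValueError("relation must be 'START' or 'END'")


print(annual_to_daily(2007, "START"))  # 2007-01-01
print(annual_to_daily(2007, "END"))    # 2007-12-31
```

Without such a parameter, a conversion from years to days has to pick one of
these two answers silently.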




