[Numpy-discussion] Time series: lag function

Mon Dec 25 17:07:12 EST 2006

I am  tryin to convert some of my time-series code written in Ox to
scipy/numpy (e.g., unit root tests, IRFs, cointegration, etc). Two key
functions I need for this are 'lag' and 'diff'. 'diff' is available but
'lag' is apparently not.

Below is my attempt at a lag function. I tried to be somewhat consistent
with the diff function which is part of numpy (also listed for convenience).
It seems to work fine for a 2-d array but not for a 1-d or 3-d array (see
tests at bottom of email). I'd appreciate any suggestions you may have.

Thanks,

Vincent

from numpy import *

def diff(a, n=1, axis=-1):
    """Calculate the nth order discrete difference along given axis.
    """
    if n == 0:
        return a
    if n < 0:
        raise ValueError, 'order must be non-negative but got ' + repr(n)
    a = asanyarray(a)
    nd = len(a.shape)
    slice1 = [slice(None)]*nd
    slice2 = [slice(None)]*nd
    slice1[axis] = slice(1, None)
    slice2[axis] = slice(None, -1)
    slice1 = tuple(slice1)
    slice2 = tuple(slice2)
    if n > 1:
        return diff(a[slice1]-a[slice2], n-1, axis=axis)
    else:
        return a[slice1]-a[slice2]

def lag(a, n=1, lag_axis=0, concat_axis=1):
    """
    Calculate the nth order discrete lag along given axis.
    Note: axis=-1 means 'last dimension'. This is the default
    for the diff function. However, the first dimension (0)
    may be preferred for time-series analysis.
    """
    a = asanyarray(a)

    n = ravel(n)                # convert input to an array
    nmax = n.max()              # determines length of array to be returned

    nd = len(a.shape)           # number of dimentions in array a
    s = [slice(None)]*nd

    # lag for 1st element in n
    s[lag_axis] = slice(nmax-n[0],-n[0])
    ret_a = a[tuple(s)]         # array to be returned

    # lags for other elements in n
    for i in n[1:]:
        s[lag_axis] = slice(nmax-i,-i)
        ret_a = concatenate((ret_a,a[tuple(s)]), concat_axis)

    return ret_a

# testing lag function
# test 1
data = arange(10)

print "=" * 30
print "test 1 - data"
print data

print "\nlag 2"
print lag(data,2)
print "\nlag 1,2,3"
print lag(data,range(1,4))
print "=" * 30 + "\n"

# test 2
data = arange(10)
data = vstack((data,data)).T

print "=" * 30
print "test 2 - data"
print data

print "\nlag 2"
print lag(data,2)
print "\nlag 1,2,3"
print lag(data,range(1,4))
print "=" * 30 + "\n"

# test 3
data = arange(10)
data = vstack((data,data)).T
data = dstack((data,data))

print "=" * 30
print "test 3 - data"
print data

print "\nlag 2"
print lag(data,2)
print "\nlag 1,2,3,"
print lag(data,range(1,4))
print "=" * 30 + "\n"