[SciPy-User] [ANN] scikit.statsmodels 0.2.0 release (Skipper Seabold)

Leon Sit wing1127aishi at gmail.com
Fri Feb 19 11:13:49 EST 2010


Dear all:

I am not a specialist in statistical method but I have rewritten some
code on econometric models at http://github.com/wingsit/KF. Since I
saw at somewhere that the statmodels authors are interested in
econometric models, I just want to put my code on the table and see if
it is useful to anybody.

The project name 'Kalman Filter' is a bit misleading because it
contains other regression techniques to solve the state space model.

Some summary of the library
-----------------------------------------------------------------------------------------
This project was started to test different available tools to track
mutual funds and hedge fund using Capital Asset
Pricing Model (CAPM thereafter) introduced my Sharpe and Arbitrage
Pricing Theory (APT thereafter) introduced by Ross.
The purpose of the CAPM model is to use a set of generic investable
market indices to decompose the exposure of a fund.
On the other hand, APT model is to find a set of generic indices which
are not necessarily investable, to decompose the
behaviour of an indices that we wish to analyse.

The License for this project is BSD.

Documentation:
http://packages.python.org/KF/

Dependency:

This project requires
Scipy (http://www.scipy.org/)
Numpy (http://numpy.scipy.org/)
CVXOPT (http://abel.ee.ucla.edu/cvxopt/)

Optional
Matplotlib (http://matplotlib.sourceforge.net/)

The primary data structure is DataFrame in timeSeriesFrame.py although
this is not used directly. The container for time
series data is TimeSeriesFrame which are used by Regression.
Essentially, this is just a data structure with a
scipy.matrix for tabular data and 2 lists which contains row and
column header information. Later in the future I might
change it it numpy.maskedarray to handle missing. The primary features
for TimeSeriesFrame are simple slicing and
iterators in both dimensions. This was designed in a way such that it
is easy to run operation on each time series or
each date. Later I might subclass it with
http://scikits.appspot.com/scikits but this is not in near future.

Regression is a base class for all the estimation procedures. It is
implemented as basic linear least squares
estimation. Primary sub-classes of Regression is ECRegression and
ICRegression which are base-class for equality and
inequality constrained model, respectively. Sample library usage is

>>>    stock_data = list(csv.reader(open("simulated_portfolio.csv", "rb"))) #Read data from file
>>>    stock = StylusReader(stock_data) #Put the data into TimeSeriesFrame
>>>    respond = stock[:,0] #slice the first column of TimeSeriesFrame as respond
>>>    regressors = stock[:,1:] #slice the columns after first column as regressors
>>>    t,n = regressors.size()
>>>    weight = scipy.identity(t)
>>>    intercept = True
>>>    obj = Regression(respond, regressors, intercept, weight = weight) #Initialise the regression object, construct
all the necessary matrix according to the value of intercept and settings
>>>    obj.train() #perform estimation. In this case it is simple matrix calculation and inversion.
>>>    print obj.predict()  #Return the estimate in TimeSerisFrame
>>>    print obj.error() #Return the estimation error in TimeSeriesFrame

Essentially, users only need to run the constructor and train() to get
the estimate. It was designed to be very simple
to use for end users.

Naming Convention:
regression.py imposes no state constraints on the model
ec*.py imposes Equality Constraints on the model. For example,
ecRegression.py is a regression model which must satisfy
Equality Constraints.
ic*.py imposes Inequality Constraints and Equality Constraints. For
example icRegression.py is a constrained regression
model which has both equality and inequality constraints.

Organisation: Most of the computation functions are located in libregression.py



Please feel free to drop a comment, forward this project and contribute!!

Contact: Wing H. Sit (wing1127aishi at gmail.com)
Something about me: A mathematically oriented person who wants to
learn coding, finance, and software development.
-----------------------------------------------------------------------------------------

Thanks

Leon



More information about the SciPy-User mailing list