[scikit-learn] Efficient forward stepwise regression

Matt Schoenbauer matt.schoenbauer3 at gmail.com
Thu Apr 22 12:37:26 EDT 2021

Hello sklearn developers,

I'd like to implement a forward stepwise regression algorithm using the
efficient QR-based procedure described in the first problem here
<http://stat.rutgers.edu/home/hxiao/stat588_2011/hw1.pdf>. As far as I can
tell, no Python package currently provides such a model. Would it be useful
for me to write this model up for sklearn?

If you're interested, here's a high-level view of how I think it would work:

- The model would have sklearn.linear_model.LinearRegression as its base
- The additional model parameters would include

   - An array of the indices (or column names) of the features in X1, i.e.
   the features currently in the model
   - The Q and R matrices of the QR decomposition of X1

- The additional methods would include

   - An add_features() method that adds a specified number of features to
   the model and updates all model parameters
   - A fit() method that takes the number of features to select and an
   optional sample weight. It calls add_features() once on a model with
   no features.
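
To make the outline above a bit more concrete, here is a rough sketch of how
the pieces could fit together. It is only an illustration under my own
assumptions, not existing sklearn code: the class name ForwardStepwiseSketch
and the attributes selected_, Q_ and R_ are hypothetical, it uses plain NumPy
instead of subclassing LinearRegression, it handles the intercept by centering,
and it leaves out sample weights. Each step scores every candidate by how much
it would reduce the residual sum of squares, then extends Q and R by one
column (a Gram-Schmidt style update) rather than refitting from scratch.

```python
import numpy as np


class ForwardStepwiseSketch:
    """Hypothetical estimator for illustration; not part of scikit-learn."""

    def fit(self, X, y, n_features):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        self.x_mean_ = X.mean(axis=0)
        self.y_mean_ = y.mean()
        self._X = X - self.x_mean_        # centered design matrix
        self._y = y - self.y_mean_        # centered response
        self._Z = self._X.copy()          # candidates, orthogonalized against Q
        self._r = self._y.copy()          # current residual
        self._scale = np.einsum("ij,ij->j", self._X, self._X)
        self.selected_ = []               # indices of the features in X1
        self.Q_ = np.empty((X.shape[0], 0))
        self.R_ = np.empty((0, 0))
        # Start from a model with no features and add them greedily.
        return self.add_features(n_features)

    def add_features(self, k):
        for _ in range(k):
            norms = np.einsum("ij,ij->j", self._Z, self._Z)
            ok = norms > 1e-10 * self._scale   # skip selected / collinear columns
            if not ok.any():
                raise ValueError("no usable candidate features left")
            corr = self._Z.T @ self._r
            # Drop in residual sum of squares if candidate j were added.
            gains = np.full(norms.shape, -np.inf)
            gains[ok] = corr[ok] ** 2 / norms[ok]
            j = int(np.argmax(gains))
            z_norm = np.sqrt(norms[j])
            q = self._Z[:, j] / z_norm
            # New column of R: projections of x_j on the old Q, then ||z_j||.
            col = np.append(self.Q_.T @ self._X[:, j], z_norm)
            m = self.R_.shape[0]
            R = np.zeros((m + 1, m + 1))
            R[:m, :m] = self.R_
            R[:, -1] = col
            self.R_ = R
            self.Q_ = np.column_stack([self.Q_, q])
            # Remove the new direction from the residual and the candidates.
            self._r -= q * (q @ self._r)
            self._Z -= np.outer(q, q @ self._Z)
            self._Z[:, j] = 0.0
            self.selected_.append(j)
        # Solve R beta = Q^T y for the coefficients of the selected features.
        self.coef_ = np.linalg.solve(self.R_, self.Q_.T @ self._y)
        self.intercept_ = self.y_mean_ - self.x_mean_[self.selected_] @ self.coef_
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        return X[:, self.selected_] @ self.coef_ + self.intercept_


# Synthetic data in which only columns 2 and 7 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = 3.0 * X[:, 2] - 2.0 * X[:, 7] + 0.1 * rng.normal(size=200)

model = ForwardStepwiseSketch().fit(X, y, n_features=2)
print(model.selected_, model.coef_, model.intercept_)
```

Calling model.add_features(1) afterwards would grow the same model by one more
feature while reusing the stored Q and R, which is the main reason for keeping
them as model parameters.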

I would do this for OLS first, but supposedly it could be adapted for
regularized models as well.

How does this sound?


Matt S.