[scikit-learn] [GSoC 2017] First Draft, request for suggestions - Improve Online Learning of Linear Models.

Karan Desai karandesai281196 at gmail.com
Wed Mar 15 04:48:28 EDT 2017


Hello developers,

I'm Karan Desai, an Electrical Engineering undergraduate at IIT Roorkee. I
have been following the community since October and initially planned to
work on the Pytest migration idea, but after detailed discussion it was
concluded that the migration task might be too small for a three-month
timeline; besides, work on it is already in progress.

I found the first project idea particularly appealing, and went about
gathering ingredients to make the perfect recipe for the summer. I can
outline it as stated below. The description on the ideas page was quite
short, so I will be happy to expand on it if need be.

1. There is a stochastic gradient descent optimizer, but I could not find
optimizers with adaptive learning rate strategies (I did see an
implementation of Adam in the MLP module, though). Adding those can be a
part of my project.
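To make the idea concrete, here is a toy sketch of the standard Adam update rule (Kingma & Ba) in plain Python, for a single scalar weight. This is not scikit-learn's API, just the mathematics the new optimizer would implement:

```python
import math

def adam_step(w, grad, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar weight.

    `state` holds the running moment estimates (m, v) and step count t.
    A plain-Python sketch of the update rule, not scikit-learn code.
    """
    m, v, t = state
    t += 1
    m = beta1 * m + (1 - beta1) * grad          # first moment (gradient mean)
    v = beta2 * v + (1 - beta2) * grad ** 2     # second moment (uncentered variance)
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, (m, v, t)

# Minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3)
w, state = 0.0, (0.0, 0.0, 0)
for _ in range(5000):
    w, state = adam_step(w, 2 * (w - 3), state, lr=0.05)
```

The vectorized Cython version would apply the same per-coordinate update to the whole weight array inside the SGD loop.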

2. I looked into the benchmarks directory and found a comparison of SGD
against coordinate descent and ridge regression. Similar benchmarking
should be done for the new optimizer(s) as well.
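The harness itself is simple; a minimal sketch in the style of the scripts in benchmarks/ might look like the following, where the two "solvers" are hypothetical stand-ins for fitting the existing and new estimators:

```python
import time

def bench(solver, X, y, n_repeats=5):
    """Return the best wall-clock fit time (seconds) over several runs,
    as the existing benchmarks/ scripts do to reduce timing noise."""
    best = float("inf")
    for _ in range(n_repeats):
        start = time.perf_counter()
        solver(X, y)
        best = min(best, time.perf_counter() - start)
    return best

# Toy stand-in "solvers", just to show the harness shape; real benchmarks
# would call estimator.fit(X, y) on the same dataset for each optimizer.
X = [[float(i)] for i in range(1000)]
y = [2.0 * i for i in range(1000)]
fast = lambda X, y: sum(y) / len(y)
slow = lambda X, y: sum(sorted(y)) / len(y)
print(f"fast: {bench(fast, X, y):.6f}s, slow: {bench(slow, X, y):.6f}s")
```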

3. Multinomial log loss (categorical cross-entropy for classification
tasks) is missing, as mentioned in the description. I can work on adding
that as well. In addition, I can work on KL divergence, Poisson, and
cosine proximity losses, to name a few. In my opinion these are fairly
standard and would be nice to have. They already exist as metrics; they
just need to be ported to Cython and exposed as optimization objectives
for linear classifiers.
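For reference, the objective itself is small. Here is a plain-Python sketch of multinomial log loss computed from raw class scores via a numerically stable softmax; the real version would be written in Cython and agree with sklearn.metrics.log_loss on the resulting probabilities:

```python
import math

def softmax(scores):
    """Numerically stable softmax over one sample's class scores."""
    m = max(scores)                       # subtract max to avoid overflow
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def multinomial_log_loss(y_true, score_matrix):
    """Average categorical cross-entropy: -mean(log p[true class]).

    y_true: integer class labels; score_matrix: raw scores per sample.
    """
    total = 0.0
    for label, scores in zip(y_true, score_matrix):
        probs = softmax(scores)
        total -= math.log(probs[label])
    return total / len(y_true)

# Two classes with equal scores -> p = 0.5 for each -> loss = ln 2
loss = multinomial_log_loss([0, 1], [[0.0, 0.0], [0.0, 0.0]])
```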

4. About a tool to anneal the learning rate: I suggest a new way of looking
at this - as a callback. I searched through the documentation and could not
find any such mechanism for hooking into model training. We should be able
to pass a callback to the constructor of a linear model which performs a
dedicated job after every epoch, be it learning rate annealing, saving a
model checkpoint, producing custom verbose output, or something as creative
as uploading data to a server for real-time plots on a website.

If this works well, we can generalize it to many classes in scikit-learn.
As a part of my project, I plan to have scikit-learn ship some ready-made
callback helpers for easy plug and play.
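A minimal sketch of what I have in mind, with all names (Callback, StepDecayAnnealer, ToyModel, the `callbacks` constructor parameter) being hypothetical placeholders rather than any existing scikit-learn API:

```python
class Callback:
    """Minimal callback interface; `on_epoch_end` runs after each epoch."""
    def on_epoch_end(self, epoch, model):
        pass

class StepDecayAnnealer(Callback):
    """Hypothetical helper: halve the learning rate every `step` epochs."""
    def __init__(self, step=10):
        self.step = step

    def on_epoch_end(self, epoch, model):
        if (epoch + 1) % self.step == 0:
            model.eta *= 0.5

class ToyModel:
    """Stand-in for a linear model taking `callbacks` in its constructor."""
    def __init__(self, eta=0.1, n_epochs=30, callbacks=()):
        self.eta = eta
        self.n_epochs = n_epochs
        self.callbacks = list(callbacks)

    def fit(self):
        for epoch in range(self.n_epochs):
            # ... one SGD pass over the data would go here ...
            for cb in self.callbacks:
                cb.on_epoch_end(epoch, self)
        return self

model = ToyModel(eta=0.1, n_epochs=30, callbacks=[StepDecayAnnealer(step=10)])
model.fit()
# eta is halved at epochs 10, 20, 30: 0.1 -> 0.0125
```

The same interface would cover checkpointing, verbose output, or live plotting by swapping in a different Callback subclass.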


I am still not sure whether this is sufficient for a three-month timeline,
because I assume the review cycles may take slightly longer given how
large the scikit-learn community is. As far as the math is concerned, I
have found some good references, some of which are listed below:


1. The first two points will rely heavily on @mblondel's lightning package
and this blog post: http://sebastianruder.com/optimizing-gradient-descent/


2. For the losses (third point), I have seen how the existing losses are
written in Cython, as well as in the metrics submodule. That should help a
lot.


3. About the fourth point, first of all I would be happy to get some
suggestions from the community. Once satisfied, I will implement a very
basic prototype with some existing class, maybe converting the verbose
logging of some class to a callback structure. I will include that in the
second draft of my proposal, which will be a preliminary version of what I
submit on the GSoC website.


More about me:

1. Github Profile: https://www.github.com/karandesai-96

2. GSoC 2016 Project: https://goo.gl/mdFZ6m

3. Joblib Contributions: https://git.io/vyMSx

4. Scikit-learn Contributions: https://git.io/vyMSF


I'll be eagerly waiting for feedback. Thanks.


Regards,

Karan Desai,

Department of Electrical Engineering,

IIT Roorkee, India.