[scikit-learn] Probabilities for LogisticRegression and LDA

Thu Feb 7 11:29:50 EST 2019

I was looking earlier at the code of predict_proba of LDA and
LogisticRegression. While we certainly have some bugs, I was a bit confused
and I thought an email would be better than opening an issue, since this
might not be one.

In the case of multiclass classification, the probabilities could be
computed under two different assumptions: either as a set of independent
binary regressions or as a log-linear model (
https://en.wikipedia.org/wiki/Multinomial_logistic_regression).

Then, we can compute the probabilities either by using one class as a pivot
and computing exp(beta_c X) / (1 + sum_k exp(beta_k X)), or by using all
classes and computing a softmax.
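
To make the two computations concrete, here is a minimal NumPy sketch. The
function names and the random scores are hypothetical, chosen only for
illustration; this is not scikit-learn code. It assumes the pivot class is
placed last and suggests that the pivot form matches a softmax in which the
pivot class has its score fixed to 0.

```python
import numpy as np


def pivot_proba(scores):
    """Pivot form. `scores` has shape (n_samples, K - 1) and holds the
    linear scores beta_k . x of the non-pivot classes. The pivot class
    is returned as the last column."""
    exp_scores = np.exp(scores)
    denom = 1.0 + exp_scores.sum(axis=1, keepdims=True)
    return np.hstack([exp_scores / denom, 1.0 / denom])


def softmax_proba(scores):
    """Softmax form. `scores` has shape (n_samples, K)."""
    # Subtracting the row maximum does not change the result and
    # avoids overflow in exp.
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)


# Hypothetical scores: 4 samples, 3 classes, the last one used as pivot.
rng = np.random.default_rng(0)
scores = rng.normal(size=(4, 2))

p_pivot = pivot_proba(scores)
# Softmax over all classes, with the pivot score fixed to 0.
p_softmax = softmax_proba(np.hstack([scores, np.zeros((4, 1))]))
```

Under these assumptions the two arrays agree, so the pivot form can be read
as a softmax with one class constrained to a zero score.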

My question is related to LogisticRegression in the OvR scheme.
Naively, I thought that it corresponded to the former case (a set of
independent regressions). However, we are using another normalization
there, which was first implemented in liblinear. I searched liblinear's
issue tracker and found: https://github.com/cjlin1/liblinear/pull/20

It is related to the following paper:
https://www.csie.ntu.edu.tw/~cjlin/papers/generalBT.pdf
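
For reference, here is a rough sketch of one reading of the OvR
normalization. It assumes that each binary decision score is passed through
a sigmoid and that each row is then divided by its sum; this is an
assumption for illustration, not a copy of the scikit-learn or liblinear
source, and the function names and scores are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def ovr_normalized_proba(decision):
    """`decision` has shape (n_samples, K): one binary (one-vs-rest)
    decision score per class. Assumed scheme: sigmoid per class, then
    divide each row by its sum so that it adds up to 1."""
    prob = sigmoid(decision)
    return prob / prob.sum(axis=1, keepdims=True)


def softmax_proba(decision):
    shifted = decision - decision.max(axis=1, keepdims=True)
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)


# Hypothetical decision scores for 2 samples and 3 classes.
decision = np.array([[2.0, -1.0, 0.5],
                     [0.0, 0.0, 0.0]])

p_ovr = ovr_normalized_proba(decision)
p_soft = softmax_proba(decision)
```

With these scores both schemes give uniform probabilities for the second
sample, but they differ for the first one, so under this reading the OvR
normalization is not the same thing as a softmax over the same scores.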

My math skills are limited and I am not sure I grasp what is going on.
Could anybody shed some light on this OvR normalization and why it is
different from the case of a set of independent regressions described on
Wikipedia?

Cheers,
-- 
Guillaume Lemaitre
INRIA Saclay - Parietal team
Center for Data Science Paris-Saclay
https://glemaitre.github.io/