[scikit-learn] Difference in normalization between Lasso and LogisticRegression + L1

Andreas Mueller t3kcit at gmail.com
Wed May 29 13:48:42 EDT 2019


That is indeed not ideal.
I think we just went with what liblinear did, and when saga was 
introduced kept that behavior.
It should probably be scaled as in Lasso, I would imagine?
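
For reference, the two documented objectives are: Lasso minimizes
(1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1, while
LogisticRegression with an L1 penalty minimizes
||w||_1 + C * sum_i log(1 + exp(-y_i * (x_i^T w + c))), with no division
by n_samples. Below is a rough sketch of the tiling experiment Jesse
describes, not a definitive benchmark. The synthetic data, the tiling
factor, the alpha and C values, and all variable names are my own
assumptions, picked only for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso, LogisticRegression

# All data and hyperparameters below are hypothetical, for illustration only.
rng = np.random.default_rng(0)
N_TILES = 10
w_true = np.array([2.0, -1.5, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])

# Regression case: the Lasso data term is divided by n_samples, so
# repeating every row N_TILES times leaves the objective unchanged.
Xr = rng.standard_normal((100, 10))
yr = Xr @ w_true + 0.5 * rng.standard_normal(100)
lasso = Lasso(alpha=0.1, tol=1e-10, max_iter=100000).fit(Xr, yr)
lasso_tiled = Lasso(alpha=0.1, tol=1e-10, max_iter=100000).fit(
    np.tile(Xr, (N_TILES, 1)), np.tile(yr, N_TILES)
)
lasso_gap = np.max(np.abs(lasso.coef_ - lasso_tiled.coef_))

# Classification case: the log loss is summed, not averaged, so tiling
# multiplies the data term by N_TILES (equivalent to using C * N_TILES).
Xc = rng.standard_normal((200, 10))
yc = (Xc @ w_true + rng.logistic(size=200) > 0).astype(int)


def fit_logreg(X, y, C):
    """Fit an L1-penalized logistic regression with the liblinear solver."""
    model = LogisticRegression(
        penalty="l1", solver="liblinear", C=C, tol=1e-8,
        max_iter=10000, random_state=0,
    )
    return model.fit(X, y)


C = 0.05
Xc_tiled, yc_tiled = np.tile(Xc, (N_TILES, 1)), np.tile(yc, N_TILES)
base = fit_logreg(Xc, yc, C)
tiled_same_C = fit_logreg(Xc_tiled, yc_tiled, C)
tiled_rescaled = fit_logreg(Xc_tiled, yc_tiled, C / N_TILES)

logreg_gap_same_C = np.max(np.abs(base.coef_ - tiled_same_C.coef_))
logreg_gap_rescaled = np.max(np.abs(base.coef_ - tiled_rescaled.coef_))

print("Lasso, tiled vs. original:", lasso_gap)
print("LogisticRegression, tiled with same C:", logreg_gap_same_C)
print("LogisticRegression, tiled with C / N_TILES:", logreg_gap_rescaled)
```

If the reasoning above holds, the Lasso coefficients should agree up to
solver tolerance, the LogisticRegression coefficients should differ when
C is kept fixed, and dividing C by the tiling factor should recover the
original fit.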


On 5/29/19 1:42 PM, Michael Eickenberg wrote:
> Hi Jesse,
>
> I think there was an effort to compare normalization methods on the 
> data attachment term between Lasso and Ridge regression back in 
> 2012/13, but this might not have been finished or extended to Logistic 
> Regression.
>
> If it is not documented well, it could definitely benefit from a 
> documentation update.
>
> As for changing it to a more consistent state, that would require 
> adding a keyword argument pertaining to this functionality and, after 
> discussion, possibly changing the default value after some deprecation 
> cycles (though this seems like a dangerous one to change at all imho).
>
> Michael
>
>
> On Wed, May 29, 2019 at 10:38 AM Jesse Livezey 
> <jesse.livezey at gmail.com> wrote:
>
>     Hi everyone,
>
>     I noticed recently that in the Lasso implementation (and docs),
>     the MSE term is normalized by the number of samples
>     https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Lasso.html
>
>     but for LogisticRegression + L1, the log loss does not seem to be
>     normalized by the number of samples. One consequence is that the
>     strength of the regularization depends explicitly on the number of
>     samples. For instance, in Lasso, if you tile a dataset N times,
>     you will learn the same coef, but in LogisticRegression, you will
>     learn a different coef.
>
>     Is this the intended behavior of LogisticRegression? I was
>     surprised by this. Either way, it would be helpful to document
>     this more clearly in the Logistic Regression docs (I can make a PR.)
>     https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
>
>     Jesse
>     _______________________________________________
>     scikit-learn mailing list
>     scikit-learn at python.org
>     https://mail.python.org/mailman/listinfo/scikit-learn
>
>
