[scikit-learn] Opinion on reference mentioning that RF uses weak learners

Fernando Marcos Wittmann fernando.wittmann at gmail.com
Sun Aug 16 15:57:36 EDT 2020


In my opinion the reference is distorting a concept that has a consolidated
definition in the community. I am also familiar with the definition of a WL
as "an estimator that performs only slightly better than random guessing",
most commonly a decision stump (
https://en.m.wikipedia.org/wiki/Decision_stump), which is not a component
of RFs.
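To make the contrast concrete, here is a minimal sketch (assuming scikit-learn is installed; the dataset is synthetic and the scores purely illustrative) comparing a decision stump in the boosting sense, the deep trees a random forest actually grows (max_depth=None is the RandomForestClassifier default), and the forest itself:

```python
# Illustrative sketch (assumes scikit-learn; synthetic data, scores are
# only for illustration). A boosting-style "weak learner" is a decision
# stump (max_depth=1), whereas RandomForestClassifier grows fully deep
# trees (max_depth=None by default).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

estimators = {
    "stump (weak learner)": DecisionTreeClassifier(max_depth=1, random_state=0),
    "deep tree (RF-style)": DecisionTreeClassifier(max_depth=None, random_state=0),
    "random forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

scores = {}
for name, est in estimators.items():
    scores[name] = cross_val_score(est, X, y, cv=5).mean()
    print(f"{name}: {scores[name]:.3f}")
```

On most datasets the stump trails the deep tree, and the forest matches or beats both, which is exactly why calling the forest's deep trees "weak learners" is misleading.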

On Sun, Aug 16, 2020, 16:22 Nicolas Hug <niourf at gmail.com> wrote:

> As previously mentioned, a "weak learner" is just a learner that barely
> performs better than random. It's more common in the context of boosting,
> but I think weak learning predates boosting, and the original RF paper by
> Breiman does make reference to "weak learners":
>
> It's interesting that Forest-RI could produce error rates not far above
> the Bayes error rate. The individual classifiers are weak. For F=1, the
> average tree error rate is 80%; for F=10, it is 65%; and for F=25, it is
> 60%. Forests seem to have the ability to work with very weak classifiers
> as long as their correlation is low.
>
> Nicolas
>
>
> On 8/16/20 2:29 PM, Guillaume Lemaître wrote:
>
> One needs to define what is the definition of weak learner.
>
> In boosting, if I recall the literature correctly, a weak learner refers to
> a learner that underfits, performing only slightly better than a random
> learner. In this regard, a tree with shallow depth is a weak learner and is
> used in AdaBoost or gradient boosting.
>
> However, in a random forest the trees used are trees that overfit (deep
> trees), so they are not weak in the same sense. That said, one will never
> be able to do with a single tree what a forest can do. In this regard, a
> single tree is weaker than the forest. Still, I have never read the term
> "weak learner" in the context of random forests.
>
> Sent from my phone - sorry for being brief and for any misspellings.
> *From:* fernando.wittmann at gmail.com
> *Sent:* 16 August 2020 20:06
> *To:* scikit-learn at python.org
> *Reply to:* scikit-learn at python.org
> *Subject:* [scikit-learn] Opinion on reference mentioning that RF uses
> weak learners
>
> Hello guys,
>
> The following reference states that Random Forests use weak learners:
> -
> https://blog.citizennet.com/blog/2012/11/10/random-forests-ensembles-and-performance-metrics#:~:text=The%20random%20forest%20starts%20with,corresponds%20to%20our%20weak%20learner.&text=Thus%2C%20in%20ensemble%20terms%2C%20the,forest%20is%20a%20strong%20learner
>
>> The random forest starts with a standard machine learning technique called
>> a “decision tree” which, in ensemble terms, corresponds to our weak learner.
>
> ...
>
>> Thus, in ensemble terms, the trees are weak learners and the random
>> forest is a strong learner.
>
>
> I completely disagree with that statement, but I would like the community's
> opinion to double-check that I am not missing something.
>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
>