[scikit-learn] what is "value" in the nodes of trees in a gbm?

Guillaume Lemaître g.lemaitre58 at gmail.com
Tue Oct 31 10:51:00 EDT 2023


You probably want to look at the following example section:

https://scikit-learn.org/stable/auto_examples/ensemble/plot_gradient_boosting_regression.html#plot-training-deviance

On Tue, 31 Oct 2023 at 14:52, Sole Galli via scikit-learn <
scikit-learn at python.org> wrote:

> Hi Nicolas,
>
> Thank you so much for the links and explanation. I really appreciate it.
>
> I am struggling to reproduce the results though. There's probably
> something I don't understand.
>
> This is an image of the top node, of the first tree in the ensemble
> (GradientBoostingRegressor):
>
> [image: Screenshot 2023-10-31 at 14-39-06 4-gbm-local - Jupyter
> Notebook.png]
>
>
> How can I manually obtain the values for squared_error and value?
>
> I thought squared_error would be:
>
> np.mean( (y_train - 0.1 * gbm.estimators_[0][0].predict(X_train))**2)
>
> And value would be:
>
> -2 * (y_train - 0.1 * gbm.estimators_[0][0].predict(X_train))
>
> But those calculations do not return the numbers shown in the node.
>
> Is there something obvious that I am doing wrong?
>
> Thanks a lot!
>
> Best
> Sole
>
> ------- Original Message -------
> On Monday, October 30th, 2023 at 5:34 PM, Nicolas Hug <niourf at gmail.com>
> wrote:
>
> The node values in GBDTs are an aggregation (typically a regularized
> average) of the *gradients* of the samples in that node.
>
> Each sample (x, y) is associated with a gradient computed as grad =
> d_loss(pred(x), y) / d_pred(x). These gradients are in the same physical
> dimension as the target (for regression). Some resources that may help:
>
>
> - https://explained.ai/gradient-boosting/descent.html
> - https://nicolas-hug.com/blog/gradient_boosting_descent (self plug)
>
> Nicolas
>
> On 30/10/2023 16:09, Sole Galli via scikit-learn wrote:
>
> Hello everyone,
>
> I am trying to interpret the outputs of gradient boosting machines sample
> per sample.
>
> What does the "value" in each node of each tree in a gbm regressor mean?
>
> [image: Untitled.png]
>
> In random forests, value is the mean target value of the observations seen
> at that node. At the top node it is usually the mean target value of the
> train set (or bootstrapped sample). As it goes down the leaves it is the
> mean target value of the samples at each child.
>
> But in gradient boosting machines it is different. And I can't decipher
> how it is calculated.
>
> I expected the value in the first tree at the top node to be zero, because
> the residuals of the first tree are zero. But it is not exactly zero.
>
> In summary, *how is the value at each node / tree calculated?*
>
> Thanks a lot!!!
>
> Warm regards,
> Sole
>
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>


-- 
Guillaume Lemaitre
Scikit-learn @ Inria Foundation
https://glemaitre.github.io/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Untitled.png
Type: image/png
Size: 493231 bytes
Desc: not available
URL: <https://mail.python.org/pipermail/scikit-learn/attachments/20231031/2aec96fb/attachment-0002.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screenshot 2023-10-31 at 14-39-06 4-gbm-local - Jupyter Notebook.png
Type: image/png
Size: 39833 bytes
Desc: not available
URL: <https://mail.python.org/pipermail/scikit-learn/attachments/20231031/2aec96fb/attachment-0003.png>