[scikit-learn] what is "value" in the nodes of trees in a gbm?

Tue Oct 31 09:48:45 EDT 2023

Hi Nicolas,

Thank you so much for the links and explanation. I really appreciate it.

I am struggling to reproduce the results though. There's probably something I don't understand.

This is an image of the top node of the first tree in the ensemble (GradientBoostingRegressor):

[Screenshot 2023-10-31 at 14-39-06 4-gbm-local - Jupyter Notebook.png]

How can I manually obtain the values for squared_error and value?

I thought squared_error would be:

np.mean( (y_train - 0.1 * gbm.estimators_[0][0].predict(X_train))**2)

And value would be:

-2 * (y_train - 0.1 * gbm.estimators_[0][0].predict(X_train))

But those calculations do not return the numbers shown in the node.

Is there something obvious that I am doing wrong?

Thanks a lot!

Best
Sole

Sent with [Proton Mail](https://proton.me/) secure email.
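
A minimal sketch of one way to check the numbers in the root node, offered under stated assumptions rather than as a definitive answer. It assumes the default `loss="squared_error"` and `subsample=1.0`, in which case the first tree is fit on the residuals `y_train - gbm.init_.predict(X_train)` (the initial prediction being the mean of `y_train`), the node "value" is the mean of those residuals in the node, and the node impurity is their mean squared error around that mean. The synthetic data from `make_regression` and the hyperparameters below are placeholders for illustration, not the data behind the screenshot.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic stand-in data (an assumption; not the data from the screenshot).
X_train, y_train = make_regression(
    n_samples=200, n_features=5, noise=10.0, random_state=0
)

gbm = GradientBoostingRegressor(
    n_estimators=3, learning_rate=0.1, max_depth=2, random_state=0
)
gbm.fit(X_train, y_train)

# Squared-error loss: the ensemble starts from a constant prediction,
# and the first tree is trained on the residuals y - initial prediction.
init_pred = gbm.init_.predict(X_train)
residuals_1 = y_train - init_pred

tree_1 = gbm.estimators_[0, 0].tree_
root_value_1 = tree_1.value[0, 0, 0]   # "value" shown in the root node
root_impurity_1 = tree_1.impurity[0]   # error shown in the root node

print("tree 1 value:   ", root_value_1, "vs", residuals_1.mean())
print("tree 1 impurity:", root_impurity_1, "vs", residuals_1.var())

# The learning rate enters the running prediction, not the tree itself:
# the second tree is trained on the residuals of init + 0.1 * tree 1.
pred_after_1 = init_pred + gbm.learning_rate * gbm.estimators_[0, 0].predict(X_train)
residuals_2 = y_train - pred_after_1

tree_2 = gbm.estimators_[1, 0].tree_
root_value_2 = tree_2.value[0, 0, 0]
root_impurity_2 = tree_2.impurity[0]

print("tree 2 value:   ", root_value_2, "vs", residuals_2.mean())
print("tree 2 impurity:", root_impurity_2, "vs", residuals_2.var())
```

Under these assumptions the root value of each tree is the mean residual (zero up to floating-point error) and the root impurity is the variance of the residuals, so neither the factor of -2 nor `0.1 * predict` of the same tree appears in the node itself. With `subsample < 1.0` or a different loss, the numbers would be computed on a different set of samples or gradients, and this check would need adjusting.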

------- Original Message -------
On Monday, October 30th, 2023 at 5:34 PM, Nicolas Hug <niourf at gmail.com> wrote:

> The node values in GBDTs are an aggregation (typically a regularized average) of the gradients of the samples in that node.
>
> Each sample (x, y) is associated with a gradient computed as grad = d_loss(pred(x), y) / d_pred(x). These gradients are in the same physical dimension as the target (for regression). Some resources that may help:
>
> - https://explained.ai/gradient-boosting/descent.html
> - https://nicolas-hug.com/blog/gradient_boosting_descent (self plug)
>
> Nicolas
>
> On 30/10/2023 16:09, Sole Galli via scikit-learn wrote:
>
>> Hello everyone,
>>
>> I am trying to interpret the outputs of gradient boosting machines sample per sample.
>>
>> What does the "value" in each node of each tree in a gbm regressor mean?
>>
>> [Untitled.png]
>>
>> In random forests, value is the mean target value of the observations seen at that node. At the top node it is usually the mean target value of the train set (or bootstrapped sample). Moving down toward the leaves, it is the mean target value of the samples in each child node.
>>
>> But in gradient boosting machines it is different, and I can't work out how it is calculated.
>>
>> I expected the value in the first tree at the top node to be zero, because the residuals of the first tree are zero. But it is not exactly zero.
>>
>> In summary, how is the value at each node / tree calculated?
>>
>> Thanks a lot!!!
>>
>> Warm regards,
>> Sole
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Untitled.png
Type: image/png
Size: 493231 bytes
Desc: not available
URL: <https://mail.python.org/pipermail/scikit-learn/attachments/20231031/87c99b19/attachment-0002.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screenshot 2023-10-31 at 14-39-06 4-gbm-local - Jupyter Notebook.png
Type: image/png
Size: 39833 bytes
Desc: not available
URL: <https://mail.python.org/pipermail/scikit-learn/attachments/20231031/87c99b19/attachment-0003.png>