[scikit-learn] Random forest fitting very well

Brian Holt bdholt1 at gmail.com
Thu Jun 23 06:05:43 EDT 2016


Hi Muhammad,

If you have not yet read the documentation, I would highly recommend
starting with the Decision Tree guide [1] and working your way through the
examples on your own data.  You'll find an example [2] of how to generate a
Graphviz-compatible dot file and visualise it.
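
As a rough illustration of the export step, here is a minimal sketch. The
synthetic data and the feature names (x0 ... x9) are stand-ins I made up for
your 10 input variables, and I am assuming a recent scikit-learn release in
which export_graphviz returns the dot source as a string when out_file=None.
Each member of a fitted forest is itself a decision tree, so the same call
works on a single tree taken from forest.estimators_.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor, export_graphviz

# Synthetic stand-in for the real data: 10 input variables, one output.
rng = np.random.RandomState(0)
X = rng.uniform(size=(500, 10))
y = 10.0 * X[:, 0] + np.sin(6.0 * X[:, 1]) + rng.normal(scale=0.1, size=500)

# Hypothetical feature names, used only to label the nodes.
feature_names = ["x%d" % i for i in range(X.shape[1])]

# A single shallow tree is easy to read once rendered.
tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)
dot_tree = export_graphviz(tree, out_file=None, feature_names=feature_names)

# One tree taken out of a small forest, exported the same way.
forest = RandomForestRegressor(n_estimators=10, max_depth=3, random_state=0)
forest.fit(X, y)
dot_forest_tree = export_graphviz(
    forest.estimators_[0], out_file=None, feature_names=feature_names
)

# Show the start of the dot source for the forest's first tree.
print(dot_forest_tree[:200])
```

If you save the string to a file such as tree.dot, the Graphviz command
"dot -Tpng tree.dot -o tree.png" should render it. max_depth=3 is only there
to keep the picture readable; a fully grown tree on ~10000 samples is too
large to inspect by eye.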

Once you're satisfied that you understand what each tree is doing with your
dataset as you vary parameters, it then makes sense to try to inject some
randomness by varying the features used in each tree, the samples, or
both [3].
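
To make the "features, samples, or both" idea concrete, here is a small
sketch that compares a single tree against ensembles that randomise the
samples (bootstrap), the features (max_features), or both. The data is
synthetic and the parameter values (n_estimators=10, max_features=3) are
arbitrary choices of mine rather than recommendations. Since your problem is
a regression, I use ExtraTreesRegressor instead of the classifier linked
in [3].

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor, RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the real data: 10 input variables, one output.
rng = np.random.RandomState(0)
X = rng.uniform(size=(2000, 10))
y = 10.0 * X[:, 0] + np.sin(6.0 * X[:, 1]) + rng.normal(scale=0.5, size=2000)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    # Baseline: one fully grown tree, no randomness.
    "single tree": DecisionTreeRegressor(random_state=0),
    # Randomise the samples only: bootstrap resampling, all features per split.
    "random samples": RandomForestRegressor(
        n_estimators=10, bootstrap=True, max_features=None, random_state=0
    ),
    # Randomise the features only: 3 candidate features per split, no bootstrap.
    "random features": RandomForestRegressor(
        n_estimators=10, bootstrap=False, max_features=3, random_state=0
    ),
    # Both, using extra trees (which also randomise the split thresholds).
    "both (extra trees)": ExtraTreesRegressor(
        n_estimators=10, bootstrap=True, max_features=3, random_state=0
    ),
}

# R^2 on the training and the held-out test split for each model.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = (model.score(X_train, y_train), model.score(X_test, y_test))
    print("%-20s train R^2 = %.3f  test R^2 = %.3f" % ((name,) + scores[name]))
```

The gap between the training and test R^2 is the number to watch as you
vary max_depth, max_features and bootstrap on your own data.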

Regards,
Brian

[1] http://scikit-learn.org/stable/modules/tree.html
[2]
http://scikit-learn.org/stable/modules/generated/sklearn.tree.export_graphviz.html#sklearn.tree.export_graphviz
[3]
http://scikit-learn.org/stable/modules/generated/sklearn.ensemble.ExtraTreesClassifier.html

On 23 June 2016 at 10:20, muhammad waseem <m.waseem.ahmad at gmail.com> wrote:

> Hi All,
> I am trying to use random forests for a regression problem, with 10 input
> variables and one output variable. I am getting a very good fit even with
> default parameters and a low n_estimators. Even with n_estimators = 10, I
> get an R^2 value of 0.95 on the testing dataset (MSE=23) and a value of
> 0.99 for the training set. I was wondering if this is common with random
> forests or whether I am missing something. Could you please share your
> experience? The total number of samples (training + testing) is 10971.
> Also, what are the most important parameters (max_depth, bootstrap,
> max_leaf_nodes etc.) that I need to play with to tune my model even
> further? Lastly, is there a way I can visualise a single tree of my
> forest (just for demonstration purposes)?
> Please see the figure below, which demonstrates how well it is fitting
> with default values.
>
>
>
> [image: Inline image 1]
> Thanks
> Kindest Regards
> Waseem
>
> _______________________________________________
> scikit-learn mailing list
> scikit-learn at python.org
> https://mail.python.org/mailman/listinfo/scikit-learn
>
>