Data Science (Python) :: Random Forest Regression

Sunil Kumar SV
2 min read · Jul 4, 2017


The intention of this post is to give a quick refresher on Random Forest Regression (using Python); it's assumed that you are already familiar with the basics. You can treat this as an FAQ as well.

What is Ensemble Learning (the idea behind Random Forest Regression)? Why is it considered more stable than a single decision tree?

In simple terms, it means building many decision trees; to predict a value, we get a prediction from each of the trees we built and take the average of those predictions as the final prediction.

It is considered more stable because averaging over many trees reduces the variance of the prediction: when we add more data points, the average changes far less than any single decision tree would.
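The averaging idea can be sketched by hand: fit several decision trees on bootstrap samples (samples drawn with replacement) and average their predictions. This is a minimal illustration, assuming a made-up toy dataset of position levels vs. salary-like values:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data: position level vs. salary-like target
X = np.arange(1, 11).reshape(-1, 1)
y = np.array([45., 50., 60., 80., 110., 150., 200., 300., 500., 1000.])

rng = np.random.RandomState(0)
trees = []
for _ in range(10):
    # Bootstrap sample: draw n rows with replacement
    idx = rng.randint(0, len(X), len(X))
    trees.append(DecisionTreeRegressor(random_state=0).fit(X[idx], y[idx]))

# The ensemble's prediction is the average of the individual tree predictions
x_new = np.array([[6.5]])
per_tree = np.array([t.predict(x_new)[0] for t in trees])
ensemble_prediction = per_tree.mean()
```

This is essentially what `RandomForestRegressor` does internally (it additionally considers random feature subsets at each split, which matters when there is more than one feature).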

***********************************************

Sample code for implementing Random Forest Regression?

from sklearn.ensemble import RandomForestRegressor
# criterion='squared_error' is the MSE criterion ('mse' in older scikit-learn versions)
var_regressor = RandomForestRegressor(n_estimators=10, criterion='squared_error', random_state=0)
var_regressor.fit(var_X, var_y)  # var_X is the array of independent variables; var_y is the array of the dependent variable

***********************************************

What's the difference between Decision Tree Regression and Random Forest Regression in terms of the splits we see when we plot the model?

Since Random Forest Regression contains many decision trees internally, its prediction curve has more intervals: the split points of all the trees combine, so the averaged prediction changes value more often. Visually, this means the "staircase" plot of a random forest has more (and smaller) steps than that of a single decision tree.

Next :- Data Science (Python) :: R Squared

Prev :- Data Science (Python) :: Decision Tree Regression

If you liked this article, please hit the ❤ icon below
