Random Forest

Today, I’ll be covering the last classifier in this series: Random Forest. There are many more, but we’ll cover them as we use them in projects.

#100DaysOfMLCode #100ProjectsInML

We already discussed Random Forest in Project 6. It is a form of ensemble learning, where you take one algorithm (or several) and apply it multiple times to build something more powerful than the original version. In the case of Random Forest, it combines lots of Decision Trees: instead of running the Decision Tree algorithm once, we run it many times.
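To make this concrete, here is a minimal sketch of how a Random Forest might be trained with scikit-learn. The toy dataset below is just a placeholder standing in for a project's real data; `n_estimators` is scikit-learn's name for the number of trees N we'll discuss in the steps below.

```python
# Minimal sketch: Random Forest with scikit-learn on a made-up dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical toy data standing in for a project's training set
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# n_estimators is the number of trees N; each tree sees a random subset
classifier = RandomForestClassifier(n_estimators=10, random_state=0)
classifier.fit(X_train, y_train)

print(classifier.predict(X_test[:5]))   # majority-vote predictions
print(classifier.score(X_test, y_test)) # accuracy on held-out data
```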

The way it works is:

  1. We pick K data points at random from the training set.
  2. We then build a decision tree using just those K data points. Instead of building one decision tree on all the data points in the training set, we build each tree on a random subset of the data.
  3. We then decide on the number of trees N we want to build, say 10 or 50, and repeat Step 1 and Step 2 for each tree.
  4. Now, when we have a new data point, we get each of the N trees to predict which category it falls under, and we assign the point to the category that receives the most votes (a rough sketch of these four steps follows this list).
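Here is a from-scratch sketch of those four steps, assuming NumPy and scikit-learn's `DecisionTreeClassifier` are available. The helper names (`build_forest`, `predict_forest`) and their defaults are my own illustration, not a library API.

```python
# From-scratch sketch of the four steps above (illustrative names).
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def build_forest(X, y, n_trees=10, k=None, seed=0):
    """Steps 1-3: build n_trees trees, each on K randomly picked points."""
    rng = np.random.default_rng(seed)
    k = k or len(X)  # default K: a bootstrap sample the size of the training set
    forest = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(X), size=k)            # Step 1: pick K points at random
        tree = DecisionTreeClassifier().fit(X[idx], y[idx])  # Step 2: tree on that subset
        forest.append(tree)                              # Step 3: repeat N times
    return forest

def predict_forest(forest, X_new):
    """Step 4: every tree votes; return the category with the most votes."""
    votes = np.stack([tree.predict(X_new) for tree in forest])  # shape (n_trees, n_points)
    return np.array([np.bincount(col).argmax() for col in votes.T.astype(int)])
```

Sampling with replacement means each tree sees a slightly different view of the data, which is why combining their votes gives a more robust prediction than any single tree.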
