Random forest, or random decision forest, is a supervised machine learning algorithm that uses ensemble learning for classification and regression. It is a flexible, straightforward algorithm that often produces good results even without hyper-parameter tuning, which makes it one of the most widely used algorithms.
Random forest is an ensemble learning method – the ‘forest’ is an ensemble of decision trees. It builds a multitude of decision trees at training time and then takes the mode of their outputs (classification) or the mean (regression), which yields better predictive performance than any of the individual trees could achieve alone.
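The two aggregation rules can be sketched in a few lines of pure Python. The tree outputs below are made-up values for illustration only, not the result of training an actual forest:

```python
from statistics import mean, mode

# Hypothetical outputs from five trees in a forest (assumed values).
class_votes = ["rainy", "sunny", "rainy", "rainy", "cloudy"]  # classification
reg_outputs = [21.0, 19.5, 20.5, 22.0, 20.0]                  # regression

# Classification: the mode, i.e. the most common vote.
print(mode(class_votes))   # rainy

# Regression: the mean of the trees' numeric predictions.
print(mean(reg_outputs))   # 20.6
```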
Not sure what decision trees are? Check this shot.
Random forests correct the decision trees’ habit of overfitting to their training set.
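One reason the forest overfits less than a single tree is bagging: each tree is trained on a bootstrap sample of the training set – a sample of the same size, drawn with replacement – so no single tree can memorize the full dataset. A minimal sketch of drawing one such sample (the toy dataset is an assumption for illustration):

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

dataset = list(range(10))  # toy dataset of ten sample indices

# A bootstrap sample: same size as the dataset, drawn with replacement,
# so some samples repeat and others are left out ("out-of-bag").
bootstrap = random.choices(dataset, k=len(dataset))
print(bootstrap)
```

In a real random forest, each tree additionally considers only a random subset of features at each split, which further decorrelates the trees.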
Suppose we have a random forest that has been trained to predict the weather.
This random forest will have multiple decision trees that each classify the input data and output a prediction. We then take a majority vote (the mode) of these outputs, and we have our final prediction.
In the above example, the majority of decision trees predicted the weather as “rainy.” Therefore, our final prediction is “rainy.”
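The majority vote in the weather example can be sketched with a vote counter. The tree predictions here are assumed values chosen to match the scenario above:

```python
from collections import Counter

# Hypothetical outputs of the individual trees in the weather forest.
tree_predictions = ["rainy", "rainy", "sunny", "rainy", "cloudy", "rainy"]

votes = Counter(tree_predictions)
final_prediction, count = votes.most_common(1)[0]
print(final_prediction, count)  # rainy 4
```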
Through this approach, an error made by a few decision trees is outvoted by the rest of the forest.