
Both random forests and boosted trees are ensemble learning methods: they build a collection of decision trees and combine their predictions into a single final prediction. Both aim to correct the bias and variance of individual decision trees, though they attack the problem in different ways.

Bias is the systematic part of the error: a model consistently predicts too high or too low because it is too simple to capture the underlying pattern. Variance is the unstable part: a model's predictions change substantially depending on which sample of data it was trained on.
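A quick way to see both failure modes is to compare an underfit and an overfit tree. Here is a minimal sketch using scikit-learn (my library choice; the dataset and parameter values are illustrative, not from the post): a depth-1 stump has high bias (large error on both train and test data), while an unlimited-depth tree has high variance (near-zero training error but a much larger test error).

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=1000, n_features=10, noise=20.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# depth=1 underfits (high bias); depth=None memorizes the sample (high variance)
for depth in (1, None):
    tree = DecisionTreeRegressor(max_depth=depth, random_state=0).fit(X_train, y_train)
    train_mse = np.mean((y_train - tree.predict(X_train)) ** 2)
    test_mse = np.mean((y_test - tree.predict(X_test)) ** 2)
    print(f"max_depth={depth}: train MSE={train_mse:.0f}, test MSE={test_mse:.0f}")
```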

In a random forest, each decision tree is trained on a random bootstrap sample of the data (and typically a random subset of features at each split), which produces a diverse set of trees. The final prediction averages the predictions of all the trees, and this averaging is what mainly reduces variance: individual trees overfit in different ways, so their errors partially cancel out.
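As a concrete sketch of that variance reduction, here is a comparison of a single decision tree against a forest of 200 bootstrap-trained trees, again using scikit-learn on a synthetic dataset (all names and parameter values are illustrative choices of mine):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

single_tree = DecisionTreeClassifier(random_state=0)
# Each of the 200 trees sees its own bootstrap sample and random feature subsets;
# the forest averages (votes over) their predictions.
forest = RandomForestClassifier(n_estimators=200, random_state=0)

print("Single tree accuracy:", cross_val_score(single_tree, X, y, cv=5).mean())
print("Random forest accuracy:", cross_val_score(forest, X, y, cv=5).mean())
```

On data like this, the forest typically scores noticeably higher than the lone tree, even though every tree in it was grown by the same algorithm.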

In a boosted tree, the trees are built sequentially rather than independently. The model starts with a simple tree (or even a constant prediction) and then adds trees one at a time; each new tree is fit to the mistakes, the residual errors, of the ensemble built so far. Repeating this many times steadily drives down the model's bias. Boosted trees often achieve higher accuracy than random forests because this iterative process corrects bias as well as variance, though they are more sensitive to hyperparameter choices and can overfit if boosting runs for too long.
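The mechanism is easiest to see written out by hand. Below is a minimal sketch of gradient boosting with squared loss (not any particular library's implementation; the learning rate, depth, and number of rounds are arbitrary illustrative values): start from a constant prediction, then repeatedly fit a shallow tree to the current residuals and add a damped version of its output to the ensemble.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=500, n_features=10, noise=10.0, random_state=0)

# Start from a constant prediction, then add shallow trees one at a time,
# each fit to the residuals (the "mistakes") of the ensemble so far.
prediction = np.full_like(y, y.mean())
learning_rate = 0.1
trees = []
for _ in range(100):
    residuals = y - prediction
    tree = DecisionTreeRegressor(max_depth=3, random_state=0)
    tree.fit(X, residuals)
    prediction += learning_rate * tree.predict(X)
    trees.append(tree)

print("Training MSE after boosting:", np.mean((y - prediction) ** 2))
```

Each pass shrinks the residuals a little, which is exactly the bias-reducing iteration described above; the learning rate damps each tree's contribution so no single round overcorrects.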
