Decision Trees
Decision Trees are versatile and easy-to-understand models in Machine Learning. A Decision Tree builds a model by learning decision rules from the underlying dataset. You will learn more in this lesson.
Decision Trees
Decision Trees are powerful and produce output that domain experts and practitioners can easily understand. They also provide the basis for many Ensemble Methods, which combine multiple models to produce a prediction for the dataset at hand.
A Decision Tree, as the name suggests, is constructed as a tree, consisting of a root node, internal nodes, and leaf nodes. Leaf nodes, also known as terminal nodes, give us the class of the instances that fall into them, and the goal is to make the terminal nodes as homogeneous as possible. The root node contains all the instances in the dataset, and each internal node partitions the instances that reach it. Once the tree is built, a new row of data can be navigated from the root, following the branch chosen by each split, until it reaches a leaf and a final prediction is made.
Consider a simple Decision Tree that distinguishes between males and females based on height and weight. It can be represented in the form of if statements as seen below.
If Height > 180 cm Then Male
If Height <= 180 cm AND Weight > 80 kg Then Male
If Height <= 180 cm AND Weight <= 80 kg Then Female
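As a sketch of how such rules could be learned in code, the snippet below fits a scikit-learn DecisionTreeClassifier on a small made-up height/weight dataset (the data and feature names are illustrative assumptions, not part of the lesson), prints the learned rules, and then navigates the tree with a new row:

from sklearn.tree import DecisionTreeClassifier, export_text

# Toy dataset: [height in cm, weight in kg] with made-up labels.
X = [[185, 90], [170, 85], [165, 60], [178, 95], [190, 88], [160, 55]]
y = ["Male", "Male", "Female", "Male", "Male", "Female"]

tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(X, y)

# Print the learned decision rules, analogous to the if statements above.
print(export_text(tree, feature_names=["Height", "Weight"]))

# Navigate the tree with a new row of data to obtain a prediction.
print(tree.predict([[175, 70]]))

The thresholds the tree learns depend on the toy data, so they will not exactly match the 180 cm and 80 kg cut-offs above, but the structure of the output mirrors those if statements.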
Decision Tree algorithms
ID3
ID3 stands for Iterative Dichotomiser 3. It was developed by Ross Quinlan in 1986 and is the predecessor of algorithms such as C4.5. The algorithm works greedily: at each node of the tree, it selects the categorical feature that yields the maximum information gain with respect to the categorical target. Trees are grown to their maximum size, exhausting all the features, and a pruning step is then applied to help the tree generalize well to unseen data.
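To make the information-gain criterion concrete, here is a minimal sketch (the helper names and the toy label lists are my own, not part of ID3 itself) that computes the entropy of a categorical target and the gain obtained by splitting on a categorical feature:

from collections import Counter
from math import log2

def entropy(labels):
    # Shannon entropy of a list of categorical labels.
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(labels, feature_values):
    # Entropy of the parent node minus the weighted entropy of the
    # child nodes produced by splitting on a categorical feature.
    total = len(labels)
    children = {}
    for value, label in zip(feature_values, labels):
        children.setdefault(value, []).append(label)
    weighted = sum(len(c) / total * entropy(c) for c in children.values())
    return entropy(labels) - weighted

# Toy example: gain from splitting a yes/no target on a weather feature.
target = ["no", "no", "yes", "yes", "yes", "no"]
outlook = ["sunny", "sunny", "overcast", "rain", "rain", "rain"]
print(information_gain(target, outlook))

At each node, ID3 would evaluate this gain for every remaining categorical feature and split on the one with the highest value.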
C4.5
The C4.5 algorithm is the successor to the ID3 algorithm. Unlike ID3, it removed the restriction that features must be categorical: it handles a numerical feature by dynamically partitioning its continuous values into a discrete set of intervals (in the binary case, by choosing a threshold on which to split).
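As a rough sketch of that idea (this is an illustration, not Quinlan's original implementation, and it uses plain information gain where C4.5 proper uses the gain ratio), one can turn a continuous feature into a binary split by scanning candidate thresholds and keeping the one with the highest gain:

from collections import Counter
from math import log2

def entropy(labels):
    # Shannon entropy of a list of categorical labels.
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def best_threshold(values, labels):
    # Scan midpoints between consecutive sorted feature values and keep
    # the threshold whose binary split yields the highest information gain.
    pairs = sorted(zip(values, labels))
    parent = entropy(labels)
    best = (None, 0.0)
    for (v1, _), (v2, _) in zip(pairs, pairs[1:]):
        if v1 == v2:
            continue
        t = (v1 + v2) / 2
        left = [l for v, l in pairs if v <= t]
        right = [l for v, l in pairs if v > t]
        gain = parent - (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        if gain > best[1]:
            best = (t, gain)
    return best

# Toy continuous feature (height in cm) against a binary target.
print(best_threshold([160, 165, 170, 178, 185, 190],
                     ["Female", "Female", "Male", "Male", "Male", "Male"]))

Here the best threshold falls between 165 and 170 cm, which cleanly separates the two toy classes.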