Decision Trees

Explore how decision trees capture nonlinear relationships in data, and apply them to both classification and regression tasks in Python. Learn key concepts such as root nodes, splitting, and pruning, and evaluate model accuracy with sklearn's DecisionTreeClassifier on real-world datasets.

Decision trees

In the first lesson of this chapter, we saw that linear regression models capture only linear relationships between the dependent and independent variables, so they miss nonlinear ones. Decision trees are designed to capture nonlinear relationships.
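To make the contrast concrete, here is a minimal sketch in plain Python. The data is made up for illustration (y jumps from 0 to 10 once x reaches 5), and the split threshold is picked by hand; a real decision tree would learn the threshold from the data. The sketch compares the sum of squared errors (SSE) of a least-squares line with that of a single-split tree.

```python
# Toy data (made up for illustration): y jumps from 0 to 10 once x reaches 5.
xs = list(range(10))
ys = [0.0 if x < 5 else 10.0 for x in xs]


def mean(values):
    return sum(values) / len(values)


# Ordinary least-squares line y = a + b*x, using the closed-form solution.
x_bar, y_bar = mean(xs), mean(ys)
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
a = y_bar - b * x_bar
line_preds = [a + b * x for x in xs]

# One-split tree: test "x < threshold" and predict the mean of y on each side.
# The threshold is chosen by hand here (an assumption of this sketch).
threshold = 5
left_mean = mean([y for x, y in zip(xs, ys) if x < threshold])
right_mean = mean([y for x, y in zip(xs, ys) if x >= threshold])
stump_preds = [left_mean if x < threshold else right_mean for x in xs]


def sse(preds):
    """Sum of squared errors against the observed ys."""
    return sum((p - y) ** 2 for p, y in zip(preds, ys))


line_error = sse(line_preds)
stump_error = sse(stump_preds)
print(f"line SSE:  {line_error:.2f}")
print(f"stump SSE: {stump_error:.2f}")
```

The straight line cannot follow the jump, so its error stays well above zero, while the single split fits this toy data exactly.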

A decision tree models data as a hierarchy of branches. It is a flowchart-like structure in which each internal node represents a test on an attribute (e.g., whether a coin flip comes up heads or tails), each branch represents an outcome of that test, and each leaf node represents a class label (the decision reached after all the attribute tests along the path). Each path from the root to a leaf represents a classification rule. Decision trees can handle both regression and classification tasks.
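The flowchart structure described above can be sketched with nested dictionaries. The tree below is written by hand with hypothetical rules (it is not learned from data), and the names `tree` and `classify` are illustrative only. Walking from the root to a leaf yields both the class label and the classification rule that the path represents.

```python
# A hand-written tree (hypothetical rules, not learned from data).
# Internal nodes test one attribute; leaves hold a class label.
tree = {
    "attribute": "outlook",
    "branches": {
        "sunny": {
            "attribute": "humidity",
            "branches": {
                "high": {"label": "stay in"},
                "normal": {"label": "play"},
            },
        },
        "overcast": {"label": "play"},
        "rainy": {
            "attribute": "windy",
            "branches": {
                True: {"label": "stay in"},
                False: {"label": "play"},
            },
        },
    },
}


def classify(node, sample, path=()):
    """Walk from the root to a leaf; return the label and the rule followed."""
    if "label" in node:  # leaf node: no further test
        return node["label"], " AND ".join(path)
    value = sample[node["attribute"]]  # outcome of this node's test
    step = f"{node['attribute']} = {value}"
    return classify(node["branches"][value], sample, path + (step,))


label, rule = classify(tree, {"outlook": "sunny", "humidity": "high", "windy": False})
print(f"IF {rule} THEN {label}")
# prints: IF outlook = sunny AND humidity = high THEN stay in
```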

Common terms used with decision trees:

Root node: Represents the entire population or sample, which is then divided into two or more homogeneous sets.
Splitting: The process of dividing a node into two or more sub-nodes.
Decision node: A sub-node that splits into further sub-nodes.
Leaf/Terminal node: A node that does not split.
Pruning ...
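The terms root node, splitting, decision node, and leaf node can be illustrated with a small sketch. The `Node` class and the `split` and `count` helpers are hypothetical names introduced here for illustration, the sample of ages is made up, and the split tests are chosen by hand rather than learned.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    samples: list  # rows that reached this node
    children: list = field(default_factory=list)

    def is_leaf(self):
        return not self.children


def split(node, test):
    """Splitting: divide a node's samples into two sub-nodes by a yes/no test."""
    yes = [s for s in node.samples if test(s)]
    no = [s for s in node.samples if not test(s)]
    node.children = [Node(yes), Node(no)]
    return node.children


def count(node):
    """Return (internal_nodes, leaf_nodes) for the subtree rooted at node.

    Internal nodes are the nodes that split: the root plus any decision nodes.
    """
    if node.is_leaf():
        return 0, 1
    internal, leaves = 1, 0
    for child in node.children:
        n_internal, n_leaf = count(child)
        internal += n_internal
        leaves += n_leaf
    return internal, leaves


# Root node: holds the entire (made-up) sample of ages.
root = Node([12, 17, 25, 33, 41, 68])
minors, adults = split(root, lambda age: age < 18)

# 'adults' splits again, so it is a decision node; 'minors' stays a leaf.
split(adults, lambda age: age < 40)

internal, leaves = count(root)
print(internal, leaves)  # prints: 2 3
```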
