Cross-Validation
Explore the concept of cross-validation in supervised learning to enhance model tuning and evaluation. Understand how dividing data into folds allows for better use of training data, multiple validation iterations, and reliable testing. This lesson helps you apply k-fold cross-validation to improve the accuracy and robustness of decision tree and ensemble models in R.
Supervising the data
In supervised learning, think of the data scientist as a teacher and the machine (e.g., a laptop) as a student. Just as a teacher supervises students’ learning, the data scientist supervises the machine learning process.
The goal is to teach students in the most effective way possible. Given the many teaching techniques available, how does a teacher know which are successful?
In a word: testing.
However, good teachers don’t jump straight into testing. They first give students opportunities to practice what they have learned.
Let’s say a teacher has developed a bank of 100 questions with answers that can help students practice and evaluate their learning via testing. The following image provides a couple of examples of how these ...