Testing the Titanic Dataset

Apply what you've learned about the random forest algorithm to the Titanic test dataset.

Profiling the Titanic test dataset

It’s best practice to profile the test dataset before making final predictions. The only goal of this profiling is to uncover problems that would get in the way of preparing the test data for prediction. Every time the test dataset is examined, there is a risk of information leakage, so we must take care that nothing we see here feeds back into how the model is built.

The following code uses the skimr package to profile the Titanic test dataset:
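As a minimal sketch of such a profile, the snippet below assumes the test split has been saved locally as `test.csv`. That file name and the helper `profile_test_data()` are assumptions for illustration and are not taken from the lesson. Only `skim()` comes from skimr; the file is read with base R.

```r
library(skimr)

# Hypothetical helper (not from the lesson): read a CSV file and
# return a skimr summary of every column.
profile_test_data <- function(path) {
  # Treat empty strings as missing so blank fields are counted as NA.
  test_df <- read.csv(path, stringsAsFactors = FALSE, na.strings = c("", "NA"))
  skim(test_df)
}

# Assumed file name for the Titanic test split; adjust to your setup.
if (file.exists("test.csv")) {
  print(profile_test_data("test.csv"))
}
```

The summary reports, for each column, its type, the number of missing values, and the completeness rate, which is usually enough to spot preparation problems without studying the test data in more depth.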
