Crafting the Recipe

Explore how to use the recipes package in R to prepare and transform data for training classification trees. Learn to convert features into factors and specify prediction formulas so that decision tree models can be trained on the Titanic dataset.

Getting started

This lesson uses the recipes package, part of the tidymodels family, to begin the R script that trains a CART classification tree. The code is developed step by step, and the first step is to load the required R packages and the Titanic training set.

Run the following code to get things started:

R
#================================================================================================
# Load libraries - suppress messages
#
suppressMessages(library(tidyverse))
suppressMessages(library(tidymodels))
suppressMessages(library(rattle))

#================================================================================================
# Load the Titanic training data and transform Sex and Embarked to factors
#
titanic_train <- read_csv("titanic_train.csv", show_col_types = FALSE) %>%
  mutate(Sex = factor(Sex),
         Embarked = factor(case_when(
           Embarked == "C" ~ "Cherbourg",
           Embarked == "Q" ~ "Queenstown",
           Embarked == "S" ~ "Southampton",
           is.na(Embarked) ~ "missing")))

# Check out the data
summary(titanic_train)
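
To see what the Embarked recoding does without needing the CSV file, the same case_when() pattern can be tried on a small in-memory table. This is only a sketch: the `toy` tibble and the `toy_recoded` name are made up for illustration and are not part of the lesson's script.

```r
library(dplyr)

# Hypothetical toy data standing in for the Embarked column,
# including one missing value
toy <- tibble(Embarked = c("C", "Q", "S", NA))

# Same recoding pattern as the lesson's script: map port codes to
# port names, label missing values explicitly, then convert to a factor
toy_recoded <- toy %>%
  mutate(Embarked = factor(case_when(
    Embarked == "C" ~ "Cherbourg",
    Embarked == "Q" ~ "Queenstown",
    Embarked == "S" ~ "Southampton",
    is.na(Embarked) ~ "missing")))

# Inspect the resulting factor levels and values
levels(toy_recoded$Embarked)
toy_recoded
```

Because the missing value is given its own "missing" label before the conversion, the resulting factor contains no NA entries, and the tree can treat "missing" as just another category.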

This script loads the following R packages:

  • tidyverse: The family of packages that provide functionality for acquiring, wrangling, and visualizing data.

  • tidymodels: The family of packages extending the tidyverse with functionality for machine learning.

  • rattle: A package providing a graphical user interface (GUI) for R. In this course, rattle is used to visualize decision trees. ...