Numeric Feature SSE
Explore how the CART algorithm efficiently determines optimal split points for numeric features in regression trees by calculating the sum of squared errors (SSE). Understand this process through an example using the Titanic dataset and the Fare feature. This lesson helps you grasp how numeric splits are evaluated and selected to improve the accuracy of regression models in R.
Finding the optimal split
As with classification trees, the CART algorithm has to find optimal regression tree splits efficiently. Numeric features make this a challenge because they can have many unique values, and each one suggests a potential split point. Once again, the CART algorithm uses an optimization to evaluate potential split points efficiently.
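To make the idea of scoring split points concrete, here is a minimal sketch in base R. It assumes one common approach: sort the feature, treat the midpoint between each pair of adjacent distinct values as a candidate threshold, and use running sums so that the SSE on each side is computed without rescanning the data. This is not necessarily the exact optimization the lesson goes on to describe. The helper name best_numeric_split and the fare and age values are hypothetical and invented for illustration; they are not rows from the Titanic dataset.

```r
# Toy data, invented for illustration only (not taken from the Titanic dataset).
fare <- c(7.25, 8.05, 8.05, 13.00, 26.55, 71.28)
age  <- c(22, 35, 30, 28, 54, 38)

# Sum of squared errors of a numeric vector around its own mean.
sse <- function(y) sum((y - mean(y))^2)

# Hypothetical helper: score every candidate threshold for one numeric feature
# and return the threshold with the lowest combined (left + right) SSE.
best_numeric_split <- function(x, y) {
  ord <- order(x)
  x <- x[ord]
  y <- y[ord]
  n <- length(y)

  # Running sums let us compute each side's SSE in one pass, using
  # SSE = sum(y^2) - (sum(y))^2 / n.
  left_n   <- seq_len(n - 1)
  left_s   <- cumsum(y)[left_n]
  left_ss  <- cumsum(y^2)[left_n]
  right_n  <- n - left_n
  right_s  <- sum(y) - left_s
  right_ss <- sum(y^2) - left_ss
  total_sse <- (left_ss - left_s^2 / left_n) + (right_ss - right_s^2 / right_n)

  # A split is only possible between two different feature values.
  valid <- which(diff(x) > 0)
  best  <- valid[which.min(total_sse[valid])]

  list(threshold = (x[best] + x[best + 1]) / 2, sse = total_sse[best])
}

result <- best_numeric_split(fare, age)
print(result)
```

On this toy data, the sketch picks a threshold of 19.775 (the midpoint between fares 13.00 and 26.55) with a combined SSE of 214.75. The repeated fare of 8.05 contributes only one candidate boundary, which is why duplicate values do not add work. Sorting once and reusing cumulative sums keeps the search close to the cost of the sort itself, rather than recomputing two means and two SSE values from scratch for every candidate.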
Consider training a CART regression tree model to impute missing values of the Age ...