Gradient Boosting: Implementation Using Scikit-learn
Explore the process of testing a gradient boosting regressor by using a trained model to make predictions on test data. Learn to evaluate model accuracy by calculating mean squared error and compare your implementation with scikit-learn's GradientBoostingRegressor to understand performance differences and efficiency.
In this lesson, we’ll look into the testing phase of gradient boosting, building upon the trained model that we previously developed. Our main objective is to utilize this trained model to make predictions on a test dataset. To validate the performance of our implementation, we will compare our results with those obtained from GradientBoostingRegressor provided by the scikit-learn library.
Training of gradient boosting regressor
Before proceeding to the testing phase, we’ll consolidate all the code widgets of the previous lesson to review and understand the progress we’ve made so far. Then, we’ll evaluate the effectiveness of our trained model on unseen data.
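As a reminder of that progress, here is a minimal sketch of what the training loop from the previous lesson could look like for squared-error loss. The function name GB_train, the hyperparameter defaults, and the use of DecisionTreeRegressor stumps are illustrative assumptions, not the lesson's exact code:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def GB_train(X, y, n_estimators=100, alpha=0.1, max_depth=1):
    """Sketch of gradient boosting training for squared-error loss.

    Returns the list of fitted weak learners and the initial
    constant prediction c (the mean of the target variable).
    """
    c = np.mean(y)                      # initial predictor: mean of targets
    F = np.full(len(y), c)              # current ensemble prediction
    list_of_models = []
    for _ in range(n_estimators):
        residuals = y - F               # negative gradient for squared error
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residuals)          # fit weak learner to the residuals
        F += alpha * tree.predict(X)    # update ensemble prediction
        list_of_models.append(tree)
    return list_of_models, c
```

Each weak learner is fit to the residuals of the current ensemble, and its scaled output is folded into the running prediction before the next iteration.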
Testing of gradient boosting regressor
We’re going to write a function named GB_predict that takes several parameters: test_data (the data on which we want to check the performance of our trained model), list_of_models (the list of weak learners with trained parameters), alpha (the learning rate), and c (the initial predictor, which in our case is the mean of the target variable). Its purpose is to make predictions on a test dataset using the trained gradient boosting model. The function iterates over the ensemble of decision tree models, updates the predictions based on each model’s output, and returns the final predictions.
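The description above can be sketched as follows. This is one plausible implementation of the signature just described, assuming the weak learners expose a scikit-learn-style predict method:

```python
import numpy as np


def GB_predict(test_data, list_of_models, alpha, c):
    """Predict with a trained gradient boosting ensemble.

    Start from the constant initial prediction c, then add each
    weak learner's contribution scaled by the learning rate alpha.
    """
    predictions = np.full(len(test_data), c, dtype=float)
    for model in list_of_models:
        predictions += alpha * model.predict(test_data)
    return predictions
```

Note that the test-time loop mirrors the training loop exactly: the same learning rate and the same initial constant must be used, otherwise the ensemble's contributions no longer add up to the function that was fit during training.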
Note: The mean squared error of scikit-learn’s gradient boosting regressor after the ...
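For the scikit-learn side of the comparison, a baseline run might look like the sketch below. The dataset here is synthetic (make_regression) purely for illustration; the lesson's actual dataset and the resulting MSE values will differ:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Hypothetical dataset standing in for the lesson's data.
X, y = make_regression(n_samples=500, n_features=5, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Reference model: scikit-learn's own gradient boosting regressor.
sk_model = GradientBoostingRegressor(
    n_estimators=100, learning_rate=0.1, max_depth=1, random_state=42
)
sk_model.fit(X_train, y_train)
sk_mse = mean_squared_error(y_test, sk_model.predict(X_test))
print(f"scikit-learn test MSE: {sk_mse:.4f}")
```

Comparing this MSE against the one produced by the from-scratch predictions on the same split shows how close the hand-rolled implementation comes to the library version.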