XGBoost Basics

Explore the essentials of using XGBoost for machine learning by understanding its core data structure, the DMatrix, and how to train and evaluate Booster objects. This lesson helps you implement gradient boosted decision trees for multiclass classification with practical coding exercises.

Chapter Goals:

  • Learn about the XGBoost data matrix
  • Train a Booster object in XGBoost

A. Basic data structures

The basic data structure for XGBoost is the DMatrix, which represents a data matrix. The DMatrix can be constructed from NumPy arrays.

The code below creates DMatrix objects with and without labels.

Python 3.5
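import numpy as np
import xgboost as xgb

# Two samples (rows), each with three features (columns)
data = np.array([
  [1.2, 3.3, 1.4],
  [5.1, 2.2, 6.6]])
# DMatrix without labels
dmat1 = xgb.DMatrix(data)
# DMatrix with one label per row of data
labels = np.array([0, 1])
dmat2 = xgb.DMatrix(data, label=labels)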

The DMatrix object can be used for training and using a ...