Challenge Solution Review

In this lesson, we walk through the solution to the previous challenge.

import sklearn.datasets as datasets
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.ensemble import GradientBoostingClassifier

X, y = datasets.load_breast_cancer(return_X_y=True)

train_x, test_x, train_y, test_y = train_test_split(X,
                                                    y,
                                                    test_size=0.2,
                                                    random_state=42)

gb = GradientBoostingClassifier(random_state=10)

param_grid = [{
    "n_estimators": [1, 2, 4, 16, 32],
    "learning_rate": [0.05, 0.1, 0.2, 0.4],
    "min_samples_leaf": [1, 2, 4, 8],
}]

cv = GridSearchCV(gb, param_grid=param_grid, scoring="f1", n_jobs=4)
cv.fit(train_x, train_y)

print("The best F1-score is {}.".format(cv.best_score_))
print("The parameter of best estimator is {}.".format(cv.best_params_))

First, we load the breast cancer dataset with load_breast_cancer at line 5. At line 7, train_test_split divides it into a training set and a test set, with the test set accounting for 20% of the samples.
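To see what the 20% split looks like in practice, the short sketch below repeats only the loading and splitting steps and prints the resulting array shapes. It reuses the variable names from the solution and is meant as an illustration, not as part of the graded answer.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Load the features and labels as NumPy arrays.
X, y = load_breast_cancer(return_X_y=True)

# Hold out 20% of the samples for testing; random_state fixes the shuffle.
train_x, test_x, train_y, test_y = train_test_split(
    X, y, test_size=0.2, random_state=42)

print(X.shape)        # (569, 30)
print(train_x.shape)  # (455, 30)
print(test_x.shape)   # (114, 30)
```

The test set size is rounded up, so 20% of 569 samples gives 114 test samples and leaves 455 for training.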

A ...