Hyper-Parameter Optimization and Kaggle Competition

Hyper-parameter optimization is an important part of model evaluation and involves automated ways to choose the right hyper-parameters. You’ll discover more in this lesson.


In the chapter on Regression, we looked at cross-validation, the intuition behind it, and its benefits. We saw how k-fold cross-validation works: it divides the training dataset into $k$ sets, trains on $k-1$ of them, and evaluates on the remaining $k^{th}$ set. This process is repeated so that each set serves once as the evaluation set, and the average performance of the model across all splits is reported.
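As a quick refresher, the k-fold procedure can be sketched with scikit-learn's `KFold`. This is a minimal illustration, not the lesson's own code: the synthetic dataset, the choice of `LinearRegression`, and the value k = 5 are all assumptions made for the example.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

# Synthetic regression data (an assumption made purely for illustration).
X, y = make_regression(n_samples=100, n_features=5, noise=10.0, random_state=0)

k = 5  # number of folds, chosen arbitrarily for this sketch
kfold = KFold(n_splits=k, shuffle=True, random_state=0)

fold_scores = []
for train_idx, test_idx in kfold.split(X):
    # Train on k-1 folds, evaluate on the held-out fold.
    model = LinearRegression()
    model.fit(X[train_idx], y[train_idx])
    fold_scores.append(model.score(X[test_idx], y[test_idx]))  # R^2 on held-out fold

# Report the average performance across all splits.
mean_score = float(np.mean(fold_scores))
print("R^2 per fold:", np.round(fold_scores, 3))
print(f"Mean R^2 over {k} folds: {mean_score:.3f}")
```

Each sample lands in the held-out fold exactly once, so the mean score uses every sample for evaluation while never scoring a model on data it was trained on.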

Stratified K-Fold Cross-Validation

Stratified k-fold cross-validation works in the same manner as k-fold cross-validation, with one additional benefit: it makes sure that each of the $k$ sets has nearly the same distribution of classes as the full dataset. It is used for classification problems, and we will also use it when setting up hyper-parameter optimization.
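The effect of stratification is easiest to see on imbalanced labels. The sketch below uses scikit-learn's `StratifiedKFold` on a toy label vector; the 80/20 class split and the placeholder features are assumptions chosen so the per-fold counts are easy to check by hand.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Imbalanced toy labels: 80 samples of class 0 and 20 of class 1 (illustrative).
y = np.array([0] * 80 + [1] * 20)
X = np.arange(len(y)).reshape(-1, 1)  # placeholder features; only the labels matter here

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

fold_class_counts = []
for fold, (train_idx, test_idx) in enumerate(skf.split(X, y)):
    # Count how many samples of each class ended up in the held-out fold.
    counts = np.bincount(y[test_idx], minlength=2).tolist()
    fold_class_counts.append(counts)
    print(f"fold {fold}: held-out class counts = {counts}")
```

Every held-out fold contains 16 samples of class 0 and 4 of class 1, preserving the 80/20 ratio. A plain `KFold` split gives no such guarantee and can leave a fold with very few samples of the minority class.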

Hyper-Parameter Optimization

Hyper-parameter optimization refers to the process of finding the optimal values of hyper-parameters such as the learning rate $\alpha$, the regularization parameter $\lambda$, the depth of the tree in Decision Trees, etc. Optimal values are those that give us the least error in the case of regression problems or the maximum accuracy in the case of classification problems. It is one of the important steps in Machine Learning. The following techniques are used for hyper-parameter optimization.


Babysitting

Babysitting is a trial-and-error method in which we set the values of the hyper-parameters manually, evaluate the model, and adjust the values based on the result. This is the technique we have been following so far in the course: we set the parameters by hand in all the regression and classification algorithms and then evaluated the performance of the models.
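In code, babysitting amounts to editing a value, re-running, and comparing scores by eye. The sketch below assumes the Iris dataset, a `DecisionTreeClassifier`, and two hand-picked values of `max_depth`; none of these come from the lesson and they serve only to illustrate the workflow.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Attempt 1: a very shallow tree, chosen by hand.
model = DecisionTreeClassifier(max_depth=1, random_state=0).fit(X_train, y_train)
acc_depth_1 = model.score(X_val, y_val)
print(f"max_depth=1 -> validation accuracy {acc_depth_1:.3f}")

# Attempt 2: after looking at the first result, try a deeper tree.
model = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
acc_depth_3 = model.score(X_val, y_val)
print(f"max_depth=3 -> validation accuracy {acc_depth_3:.3f}")
```

This works for one or two hyper-parameters, but the number of manual attempts grows quickly as more hyper-parameters and candidate values are added.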

Grid Search

Grid Search finds the best hyper-parameter values by trying out the values listed in a grid. It evaluates every possible combination of the values specified in the grid and returns the combination that gives the best result.

Scikit-learn provides the GridSearchCV class, which runs cross-validation for each combination of parameters specified in the grid. For the given values, GridSearchCV exhaustively considers all parameter combinations. It gives us the best combination of hyper-parameters and the best cross-validated score corresponding to that combination.
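A minimal sketch of `GridSearchCV` follows. The dataset (Iris), the estimator (`DecisionTreeClassifier`), and the specific grid values are assumptions chosen for illustration; stratified 5-fold cross-validation is used for each combination.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# The grid: 4 values x 3 values = 12 combinations (values are illustrative).
param_grid = {
    "max_depth": [2, 3, 4, 5],
    "min_samples_split": [2, 5, 10],
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
grid_search = GridSearchCV(
    estimator=DecisionTreeClassifier(random_state=0),
    param_grid=param_grid,
    scoring="accuracy",
    cv=cv,
)
grid_search.fit(X, y)  # fits every combination on every fold

n_combinations = len(grid_search.cv_results_["params"])
print("Combinations tried:", n_combinations)
print("Best parameters:", grid_search.best_params_)
print(f"Best mean CV accuracy: {grid_search.best_score_:.3f}")
```

With 12 combinations and 5 folds, this run trains 60 models, plus one final refit on the full data with the best combination. The cost multiplies with every hyper-parameter added to the grid, which is the main drawback of an exhaustive search.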

Random Search

Random Search, as the name suggests, tries out randomly chosen combinations of hyper-parameter values instead of every combination in a grid. Like Grid Search, it gives us the best combination of hyper-parameters found and the best score corresponding to that combination.
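In scikit-learn, random search is available as `RandomizedSearchCV`, which the lesson text does not name explicitly. The sketch below assumes the Iris dataset, a `DecisionTreeClassifier`, integer ranges sampled with `scipy.stats.randint`, and a budget of 10 sampled combinations; all of these are illustrative choices.

```python
from scipy.stats import randint
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV, StratifiedKFold
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Distributions to sample from, rather than a fixed grid (ranges are illustrative).
param_distributions = {
    "max_depth": randint(1, 11),          # integers 1..10 (upper bound is exclusive)
    "min_samples_split": randint(2, 21),  # integers 2..20
}

random_search = RandomizedSearchCV(
    estimator=DecisionTreeClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=10,  # number of random combinations to evaluate
    scoring="accuracy",
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
    random_state=0,  # makes the sampled combinations reproducible
)
random_search.fit(X, y)

n_sampled = len(random_search.cv_results_["params"])
print("Combinations sampled:", n_sampled)
print("Best parameters:", random_search.best_params_)
print(f"Best mean CV accuracy: {random_search.best_score_:.3f}")
```

The search space here contains 10 x 19 = 190 possible combinations, but only `n_iter=10` of them are evaluated. The budget is set directly rather than being dictated by the size of the grid, at the price of possibly missing the best combination.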
