Putting It All Together

Let's combine all the steps and look at the final result.

Overall view of the result

In this chapter, we have not modified the training pipeline much. The data preparation part is roughly the same as in the previous chapter, except that we performed the split using Scikit-Learn this time. The model configuration part is mostly the same as well, but we changed the loss function so that it is now appropriate for a classification problem. The model training part has been quite straightforward ever since we developed the StepByStep class in the last chapter.

But now, after training a model, we can use our class's predict method to get predictions for our validation set, and then use Scikit-Learn's metrics module to compute a wide range of classification metrics, such as the confusion matrix.
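To make the metrics step concrete, here is a minimal sketch of computing a confusion matrix with Scikit-Learn. The labels and probabilities are made-up values standing in for the validation labels and for the output of the predict method after conversion to probabilities; the 0.5 threshold is also an assumption.

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical validation labels and predicted probabilities (made-up values,
# standing in for the real validation set and the model's predictions)
y_val = np.array([0, 0, 1, 1, 1, 0, 1, 0])
probabilities = np.array([0.1, 0.6, 0.8, 0.4, 0.9, 0.2, 0.7, 0.3])

# Assumed decision threshold of 0.5 to turn probabilities into classes
y_pred = (probabilities >= 0.5).astype(int)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_val, y_pred)
tn, fp, fn, tp = cm.ravel()

precision = precision_score(y_val, y_pred)
recall = recall_score(y_val, y_pred)

print(cm)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

For a binary problem, unpacking `cm.ravel()` gives the four counts (true negatives, false positives, false negatives, true positives) that most other classification metrics are built from.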

Behold your pipeline:

Data preparation

The major difference from the previous version is that we use data that has already been split into training and validation sets. This can also be observed in the given code below:
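As an illustration only, and not the lesson's original listing, a minimal sketch of this kind of data preparation might look like the following. It assumes PyTorch and Scikit-Learn are installed; the synthetic data, the 80/20 split, the batch size, and all variable names are hypothetical.

```python
import numpy as np
import torch
from sklearn.model_selection import train_test_split
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical data: 100 points, 2 features, binary labels (made up for this sketch)
rng = np.random.default_rng(13)
X = rng.normal(size=(100, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# Split with Scikit-Learn BEFORE building any PyTorch dataset
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=13
)

# Convert the already-split Numpy arrays into float tensors
x_train_tensor = torch.as_tensor(X_train).float()
y_train_tensor = torch.as_tensor(y_train.reshape(-1, 1)).float()
x_val_tensor = torch.as_tensor(X_val).float()
y_val_tensor = torch.as_tensor(y_val.reshape(-1, 1)).float()

# One dataset and one data loader per split
train_dataset = TensorDataset(x_train_tensor, y_train_tensor)
val_dataset = TensorDataset(x_val_tensor, y_val_tensor)

train_loader = DataLoader(train_dataset, batch_size=16, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=16)
```

Because the arrays are split before being wrapped in tensors, each split gets its own dataset and data loader, and only the training loader needs shuffling.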
