How to use a pre-trained deep learning model

In this article, we will use a pre-trained model for image classification. This strategy is just one of the three strategies mentioned in my previous articles to implement transfer learning techniques. If you haven’t checked them out yet, I would suggest reading them first to understand transfer learning. Go to these,

and understand the idea and strategies behind transfer learning.

Points to consider before moving on to the implementation

Some questions may arise while you’re implementing any of the transfer learning strategies:

When do you fine-tune?
How do you fine-tune?
How do you decide what type of transfer learning you should perform on a new dataset?

The answer to the above questions depends mostly on two factors i.e., the size of the new dataset and its similarity to the original dataset on which your pre-trained model was trained.

I have listed four common rules that you should keep in mind while using transfer learning strategies. These are:

New dataset is large and very different from the original dataset - Since the dataset is large, we can go ahead and train the ConvNet from scratch, but it would take some time to train a complex network. So, in this case, it is common to use a fine-tuned approach. It may be the case that fine-tuning the ConvNet won’t produce good results. This is due to the fact that the new dataset is very different from the original one. In that case, you can train the ConvNet from scratch.
New dataset is large and similar to the original dataset - Again, the dataset is large and we can use the fine-tune approach to save the time taken in training the network from scratch. The model would give good results as the new dataset is very similar to the original one.
New dataset is small and similar to the original dataset - As the dataset is small, fine-tuning may lead to over-fitting of the model. So, it is advisable to use ConvNet as a fixed feature detector, add your own classifier (the dense layers) on top of it, and just train that classifier only.
New dataset is small but very different from the original dataset - Since the dataset is small, you can use the above-mentioned strategy. But if the dataset is very different from the original one, then it is advisable to fine-tune the ConvNet, but make sure not to go too deep in the network and try to adjust the weights of a small number of layers only.

There are few additional things to keep in mind while you perform transfer learning:

Learning rates: It is advisable to use small learning rates while fine-tuning the ConvNet model. This is because we expect that the model is well- trained and that it may not be good to distort the weights too fast and too much as that could result in a very poor model.
Constraints from pre-trained models: You should take the constraints into account before you actually work with a model. You should be familiar with the input format which the model has taken to be trained i.e., the format of the original dataset.

Now, we are good to dive into the coding part. In this article, we will be implementing the first strategy i.e., using a pre-trained model to work on an image classification problem.

Once you run the above code, you will get an output similar to this.

[('n01871265', 'tusker', 0.5334062), ('n02504458', 'African_elephant', 0.28311166), ('n02504013', 'Indian_elephant', 0.18275209)]

Note that you may get a different output based on the image you have used to predict the labels.

If you take an image of let’s say, “a man in turban” and want to see what the label will be, you may get an output like this:

[('n04350905', 'suit', 0.3715849), ('n10148035', 'groom', 0.14189447), ('n04591157', 'Windsor_tie', 0.090490855)]

The output is not up to the mark as no label has the probability value higher than 50%. This happens due to the fact that the image we used may be very different from the original dataset on which ResNet50 model was trained.

How to use a pre-trained deep learning model

Points to consider before moving on to the implementation

Import the required libraries

Load the model

Load the image

Make the predictions