Significant contributions in the field of image recognition and classification have been empowered by convolutional neural networks and their improved architectures and increased depth.
However, these advancements do not imply that the techniques can be employed in time-efficient settings, as there is a tradeoff between a model's computational cost and its accuracy.
MobileNet models take advantage of the property that many of the parameters in trained models are redundant and capture the same features within the images. Hence, there is significant room for pruning and reducing the number of parameters to produce much more time-efficient models that can be used on lightweight, computationally constrained devices such as mobile phones. This is done precisely via depth-wise separable filters.
Standard convolution techniques attempt to filter the features and then merge them into a new output all in one step.
For a standard convolutional layer with a D_K × D_K kernel, M input channels, N output channels, and a D_F × D_F feature map, the computational cost is:

D_K · D_K · M · N · D_F · D_F

Note: It follows that the computational cost grows multiplicatively with the number of input channels, the number of output channels, the dimensions of the feature map, and the kernel dimensions.
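As a quick sanity check, this multiplicative growth can be computed directly. The concrete dimensions below are illustrative, not taken from any particular MobileNet layer:

```python
# Cost of a standard convolution, using the symbols from the text:
# d_k: kernel size, m: input channels, n: output channels, d_f: feature-map size.
def standard_conv_cost(d_k, m, n, d_f):
    # One multiply-accumulate per kernel element, per input channel,
    # per output channel, per output position.
    return d_k * d_k * m * n * d_f * d_f

# Because the factors multiply, doubling any single one doubles the total cost.
print(standard_conv_cost(3, 32, 64, 112))   # 231211008 multiply-adds
print(standard_conv_cost(3, 64, 64, 112))   # exactly twice as many
```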
Depth-wise separable filters articulate this computation into two steps:

1. Depth-wise convolution: a single filter is applied to each input channel separately.
2. Point-wise convolution: a 1 × 1 convolution combines the outputs of the depth-wise step into new features.
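The two steps can be sketched in Keras as a minimal block. The shapes and layer configuration here are illustrative assumptions, not MobileNet's exact architecture:

```python
import keras

# A two-step depth-wise separable block (illustrative shapes, not MobileNet's).
block = keras.Sequential([
    keras.Input(shape=(112, 112, 32)),
    # Step 1: depth-wise convolution filters each input channel separately.
    keras.layers.DepthwiseConv2D(kernel_size=3, padding='same'),
    # Step 2: point-wise (1x1) convolution combines the filtered channels.
    keras.layers.Conv2D(filters=64, kernel_size=1),
])
block.summary()
```

Note that the number of output channels is set only in the point-wise step; the depth-wise step keeps one filtered map per input channel.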
The sequential chaining is illustrated in the diagram below:
The computational cost of depth-wise convolution can be computed using the following equation, where D_K is the kernel dimension, M is the number of input channels, and D_F is the feature-map dimension:

D_K · D_K · M · D_F · D_F
Like depth-wise convolution, point-wise convolution has a computational cost, shown in the following equation, where N is the number of output channels:

M · N · D_F · D_F
Therefore, the total cost can be expressed as the sum of the costs of depth-wise convolution and point-wise convolution:

D_K · D_K · M · D_F · D_F + M · N · D_F · D_F

Dividing this sum by the standard convolution cost shows that the factorization reduces the computation by a factor of 1/N + 1/D_K².
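The savings can be verified numerically. This sketch compares the two-step cost against the standard cost (the dimensions are illustrative):

```python
# Symbols as in the text: d_k kernel size, m input channels,
# n output channels, d_f feature-map size.
def standard_cost(d_k, m, n, d_f):
    return d_k * d_k * m * n * d_f * d_f

def depthwise_separable_cost(d_k, m, n, d_f):
    depthwise = d_k * d_k * m * d_f * d_f   # one filter per input channel
    pointwise = m * n * d_f * d_f           # 1x1 combination step
    return depthwise + pointwise

d_k, m, n, d_f = 3, 32, 64, 112
ratio = depthwise_separable_cost(d_k, m, n, d_f) / standard_cost(d_k, m, n, d_f)
print(f"{ratio:.3f}")   # equals 1/n + 1/d_k**2, roughly an 8x reduction here
```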
An example of classification for the input image is shown below:
import numpy as np
import keras
from keras.applications import imagenet_utils

def preprocess_image(input_file_name):
    # Load the image at MobileNet's expected input resolution.
    image_loaded = keras.utils.load_img(input_file_name, target_size=(224, 224))
    image_array = keras.utils.img_to_array(image_loaded)
    # Add a batch dimension so the array has shape (1, 224, 224, 3).
    image_array_padded = np.expand_dims(image_array, axis=0)
    return keras.applications.mobilenet.preprocess_input(image_array_padded)

mobile = keras.applications.mobilenet.MobileNet()

def display_results(arr):
    # Each prediction is a (class_id, label, probability) tuple.
    for tuple_pred in arr[0]:
        print(tuple_pred[1], '--->', tuple_pred[2])

preprocessed_image = preprocess_image('input.jpg')
predicted_output = mobile.predict(preprocessed_image)
results_decoded = imagenet_utils.decode_predictions(predicted_output)
print("The predicted labels along with their respective probabilities are as follows:")
display_results(results_decoded)
Here, we instantiate MobileNet(), which is a pre-trained classifier. We then use MobileNet()'s predict function to render an appropriate prediction for the input image and store the possible image labels, along with their relative probabilities, in the results.