Build a digit classifier in R

R is a programming language mainly used for statistical computing and data analysis. It offers numerous libraries and packages to facilitate machine learning operations, such as digit classification. Digit classification is a fundamental task in computer vision and machine learning. A model must be trained to identify and categorize handwritten digits into their corresponding values.

Getting started

This Answer walks us through creating a digit classifier in R using the MNIST dataset. The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits that is commonly used for training various image processing systems.

Data preparation

Data preparation is a key step in any data analysis because it makes sure the data is ready for analysis. After gathering the dataset, the first step is to load it into the working environment. As mentioned earlier, we’ll be using the MNIST dataset, which contains 28×28 grayscale images of handwritten digits (0–9). The dataset comes already divided into training and testing sets, so we load it and extract both:

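library(keras)  # provides dataset_mnist()

# Load MNIST and extract the predefined training and testing sets
mnist <- dataset_mnist()
x_train <- mnist$train$x  # 60000 x 28 x 28 array of pixel values (0-255)
y_train <- mnist$train$y  # 60000 integer labels (0-9)
x_test <- mnist$test$x    # 10000 x 28 x 28 array
y_test <- mnist$test$y    # 10000 integer labels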
Loading and splitting the MNIST dataset

Data preprocessing

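Data preprocessing ensures that the data we use for machine learning or analysis is clean and consistent. It’s important to preprocess the data before training a model because it directly affects the model’s performance. Normalization is a preprocessing technique that scales the data to a common range. Here, we divide the pixel values, which lie between 0 and 255, by 255 so that they fall between 0 and 1. Because the convolutional layer used later expects an explicit channel dimension, we also reshape each 28×28 image to 28×28×1.

# Scale pixels to [0, 1] and add the single grayscale channel dimension
x_train <- array_reshape(x_train / 255, c(nrow(x_train), 28, 28, 1))
x_test <- array_reshape(x_test / 255, c(nrow(x_test), 28, 28, 1))
Normalizing and reshaping the dataset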
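To see what this step does to the data, the sketch below applies the same scaling to a small made-up batch of images instead of MNIST itself. The array toy_images and its values are assumptions chosen for illustration, and the sketch uses only base R. It also appends the channel dimension that a layer with input_shape = c(28, 28, 1) expects.

```r
# Toy stand-in for a batch of 2 grayscale images of 28 x 28 pixels
# (hypothetical values; real MNIST pixels are also integers in 0-255)
set.seed(1)
toy_images <- array(sample(0:255, 2 * 28 * 28, replace = TRUE), dim = c(2, 28, 28))
toy_images[1, 1, 1] <- 0     # make sure both extremes are present
toy_images[1, 1, 2] <- 255

# Scale to [0, 1]; dividing an array by a scalar keeps its dimensions
toy_scaled <- toy_images / 255
print(range(toy_scaled))
print(dim(toy_scaled))

# Append a channel dimension of size 1; the pixel values are unchanged
toy_with_channel <- toy_scaled
dim(toy_with_channel) <- c(dim(toy_scaled), 1)
print(dim(toy_with_channel))
```

Checking the range and the dimensions after preprocessing is a cheap way to catch mistakes before training starts.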

Build the model

Once the data is ready for training, we need to define the architecture of the model that we’ll train and later use for classification. We’ll create a simple convolutional neural network (CNN) for digit classification because it’s well suited to image data. A typical CNN consists of the following layers:

  • Input

  • Convolutional

  • Max-pooling

  • Fully connected

  • Output

Below is a code snippet that shows how to build a basic CNN model in R using Keras:

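# Stack the layers in a single pipeline
model <- keras_model_sequential() %>%
  # 32 filters of size 3x3 applied to the 28x28 single-channel input
  layer_conv_2d(filters = 32, kernel_size = c(3, 3), activation = 'relu', input_shape = c(28, 28, 1)) %>%
  # Downsample each feature map by a factor of 2
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  # Flatten the feature maps into a vector for the dense layers
  layer_flatten() %>%
  layer_dense(units = 128, activation = 'relu') %>%
  # One output per digit class; softmax turns the scores into probabilities
  layer_dense(units = 10, activation = 'softmax')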
Defining the model
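To get a feel for the size of this network, we can work out the output shape and the number of trainable parameters of each layer by hand. The following is a minimal sketch in base R. It assumes the Keras defaults for the convolutional layer (a stride of 1 and no padding), and all variable names are made up for illustration.

```r
# Shape and parameter arithmetic for the layers defined above (base R only)
input_hw <- 28   # image height and width
channels <- 1    # grayscale
filters <- 32
kernel <- 3
pool <- 2

# A 3x3 convolution without padding shrinks each side by kernel - 1
conv_hw <- input_hw - kernel + 1
# Each filter has kernel * kernel * channels weights plus 1 bias
conv_params <- (kernel * kernel * channels + 1) * filters

# 2x2 max pooling halves each side and has no trainable parameters
pool_hw <- conv_hw %/% pool

# Flattening yields one long vector per image
flat_units <- pool_hw * pool_hw * filters

# Each dense unit has one weight per input plus 1 bias
dense_params <- (flat_units + 1) * 128
output_params <- (128 + 1) * 10

total_params <- conv_params + dense_params + output_params
cat("Feature map after convolution:", conv_hw, "x", conv_hw, "x", filters, "\n")
cat("Feature map after pooling:", pool_hw, "x", pool_hw, "x", filters, "\n")
cat("Flattened units:", flat_units, "\n")
cat("Total trainable parameters:", total_params, "\n")
```

Almost all of the parameters sit in the first dense layer, which is typical for small CNNs that flatten a large feature map. Calling summary(model) on the Keras model should report the same per-layer shapes and counts.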

Compile the model

Once the model is defined, the next step is to compile it by specifying the loss function, optimizer, and evaluation metric:

model %>% compile(
  loss = 'sparse_categorical_crossentropy',
  optimizer = 'adam',
  metrics = c('accuracy')
)
Compiling the model

Model training

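The next step is to train the model on the training data. We specify the number of epochs and the batch size, and we set validation_split = 0.2 so that the last 20% of the training data is held out to monitor validation performance after each epoch.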

epochs <- 10
batch_size <- 64
history <- model %>% fit(
  x_train, y_train,
  epochs = epochs,
  batch_size = batch_size,
  validation_split = 0.2
)
Training the model
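The settings above determine how many weight updates the model performs. The following base R sketch works through the arithmetic for the 60,000 MNIST training images; the variable names are made up for illustration.

```r
n_train <- 60000          # MNIST training images
validation_split <- 0.2
batch_size <- 64
epochs <- 10

# Samples held out for validation and samples used for weight updates
n_val <- round(n_train * validation_split)
n_fit <- n_train - n_val

# One weight update per batch; a final partial batch still counts as a step
steps_per_epoch <- ceiling(n_fit / batch_size)
total_updates <- steps_per_epoch * epochs

cat("Training samples:", n_fit, "\n")
cat("Validation samples:", n_val, "\n")
cat("Steps per epoch:", steps_per_epoch, "\n")
cat("Total weight updates:", total_updates, "\n")
```

Halving the batch size doubles the number of updates per epoch, which is one reason smaller batches take longer to train.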

Model evaluation

The final step is to evaluate the trained model on the test data, which it hasn’t seen during training. The code is given below:

eval_result <- model %>% evaluate(x_test, y_test)
# [[ ]] works whether evaluate() returns a named list or a named numeric vector
cat("Test accuracy:", eval_result[["accuracy"]], "\n")
Evaluating the model
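The accuracy reported here is the fraction of test images whose most probable class matches the true label. The sketch below shows that calculation in base R on a small made-up probability matrix. The values in probs and true_labels are assumptions for illustration, not model output.

```r
# Hypothetical softmax outputs for 4 images over 3 classes (each row sums to 1)
probs <- matrix(c(0.7, 0.2, 0.1,
                  0.1, 0.8, 0.1,
                  0.3, 0.3, 0.4,
                  0.6, 0.3, 0.1),
                nrow = 4, byrow = TRUE)
true_labels <- c(0, 1, 2, 1)   # integer labels starting at 0, like MNIST

# The predicted class is the column with the highest probability;
# subtract 1 because R indexes from 1 while the labels start at 0
predicted <- max.col(probs, ties.method = "first") - 1

# Accuracy is the share of predictions that match the true labels
accuracy <- mean(predicted == true_labels)
cat("Predicted classes:", predicted, "\n")
cat("Accuracy:", accuracy, "\n")
```

The same approach should apply to the matrix of class probabilities that predict() returns for the test images, with 10 columns instead of 3.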

Parameter selection

  • Softmax function: It’s often used in the neural network’s output layer to handle multi-class classification problems. It converts the layer’s raw scores into a probability for each class, and the probabilities for all classes sum to 1. In our case, the output layer has 10 units, one for each digit from 0 to 9.

  • Adam optimizer: Adam adapts the learning rate for each parameter during training, so it typically converges quickly and works well with its default settings. This makes it a common first choice for classification problems.

  • Sparse categorical cross entropy: It’s suitable for multi-class classification where the target labels are integers, as the MNIST labels are. It computes the same loss as categorical cross entropy but doesn’t require the labels to be one-hot encoded.

  • Evaluation metric: We used accuracy as an evaluation metric because it provides a simple and intuitive way to measure the performance of the model. It’s a commonly used metric in classification problems. We can use other metrics as per our requirements.

  • Epoch: It refers to one complete pass of the training dataset through the neural network during training. Too few epochs can lead to underfitting, and too many can lead to overfitting, so it’s important to select an appropriate number. In the complete code below, we set epochs = 5 to keep the computational cost low while still achieving good accuracy.

  • Batch size: It refers to the number of samples passed to the network before updating the model parameters. There is a trade-off between accuracy and speed: larger batch sizes can lead to faster training but might result in lower accuracy, while smaller batch sizes can provide better accuracy but take longer to train. In the complete code below, we used batch_size = 32 and got good results.
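To make the softmax function and the sparse categorical cross entropy loss more concrete, the following is a minimal base R sketch of both for a single sample. The helper names softmax and sparse_cce and the example scores are made up for illustration; Keras computes these internally.

```r
# Numerically stable softmax: subtracting the maximum does not change the result
softmax <- function(scores) {
  exps <- exp(scores - max(scores))
  exps / sum(exps)
}

# Sparse categorical cross entropy for one sample:
# the negative log of the probability assigned to the true class.
# `label` is an integer class starting at 0, hence the + 1 for R's indexing
sparse_cce <- function(probs, label) {
  -log(probs[label + 1])
}

scores <- c(2.0, 1.0, 0.1)   # hypothetical raw outputs for 3 classes
probs <- softmax(scores)
cat("Probabilities:", round(probs, 3), "\n")
cat("Sum of probabilities:", sum(probs), "\n")

# The loss is small when the true class gets a high probability
loss_correct <- sparse_cce(probs, label = 0)   # true class has the highest score
loss_wrong <- sparse_cce(probs, label = 2)     # true class has the lowest score
cat("Loss if the true class is 0:", loss_correct, "\n")
cat("Loss if the true class is 2:", loss_wrong, "\n")
```

Because the loss takes the integer label directly, the labels don’t have to be converted to one-hot vectors first.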

Complete code

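The complete code of the digit classifier in R is given below. Note that, unlike the CNN described above, it uses a simpler fully connected network: a flatten layer followed by two dense layers. Press the “Run” button to train a model on the MNIST dataset and find its accuracy.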

# Load the necessary libraries
library(keras)
library(reticulate)


# Install TensorFlow into the Python environment used by reticulate
py_install("tensorflow")


# Load the MNIST dataset
mnist <- dataset_mnist()

# Split the data into training and testing sets
x_train <- mnist$train$x
y_train <- mnist$train$y
x_test <- mnist$test$x
y_test <- mnist$test$y

# Normalize the pixel values to be in the range [0, 1]
x_train <- x_train / 255
x_test <- x_test / 255

# Create a neural network model
model <- keras_model_sequential() %>%
  layer_flatten(input_shape = c(28, 28)) %>%
  layer_dense(units = 128, activation = 'relu') %>%
  layer_dense(units = 10, activation = 'softmax')

# Compile the model
model %>% compile(
  loss = 'sparse_categorical_crossentropy',
  optimizer = optimizer_adam(),
  metrics = c('accuracy')
)

# Train the model
model %>% fit(x_train, y_train, epochs = 5, batch_size = 32)

# Evaluate the model on the test set
evaluation <- model %>% evaluate(x_test, y_test)
print(evaluation)
cat("Test accuracy:", evaluation[["accuracy"]], "\n")

Complete coding example

Copyright ©2024 Educative, Inc. All rights reserved