Handling Image Data
Explore questions about handling image data, which often come up in interviews for data science and machine learning roles.
Image data is everywhere—whether it's medical scans, social media photos, or product images in e-commerce. Knowing how to process, filter, and augment this kind of data is a key skill for data scientists and machine learning engineers. In this lesson, we’ll build a Gaussian blur filter from scratch and apply common data augmentations in Python. Let’s get started.
Implement a simple Gaussian filter
You’re working on an image preprocessing pipeline for a computer vision application. One of the tasks involves blurring images using a Gaussian filter to reduce noise and improve feature extraction downstream.
In this challenge, you’re asked to implement a 2D Gaussian kernel from scratch using NumPy—no high-level libraries like OpenCV or SciPy are allowed.
import numpy as np
import matplotlib.pyplot as plt

def create_gaussian_kernel(size, sigma):
    """
    Create a 2D Gaussian kernel.

    Args:
        size: Kernel size (should be odd)
        sigma: Standard deviation of Gaussian distribution

    Returns:
        2D numpy array containing the Gaussian kernel
    """
    return kernel

def apply_gaussian_filter(image, kernel_size=5, sigma=1.0):
    """
    Apply Gaussian filter to an image.

    Args:
        image: Input image (2D numpy array)
        kernel_size: Size of the Gaussian kernel
        sigma: Standard deviation of Gaussian distribution

    Returns:
        Filtered image
    """
    # TODO - Create kernel
    # TODO - Pad image
    # TODO - Create output image
    # TODO - Apply convolution
    return filtered_image

# Example usage
if __name__ == "__main__":
    # Create a sample image (100x100 with a white square in the middle)
    image = np.zeros((100, 100))
    image[40:60, 40:60] = 1.0

    # Add some noise
    noisy_image = image + np.random.normal(0, 0.1, image.shape)

    # Apply Gaussian filter
    filtered_image = apply_gaussian_filter(noisy_image, kernel_size=5, sigma=1.0)

    # Plot results if matplotlib is available
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 5))
    ax1.imshow(noisy_image, cmap='gray')
    ax1.set_title('Noisy Image')
    ax2.imshow(filtered_image, cmap='gray')
    ax2.set_title('Filtered Image')
    plt.savefig('output/graph.png')
Sample answer
Here’s how we can proceed with this:
Define the grid: Start by creating a square grid centered around 0. For a k x k kernel, use np.linspace or np.arange to form the x and y axes.
Apply the Gaussian formula: Use the 2D Gaussian function and vectorize this using NumPy to apply it across the meshgrid.
Normalize the kernel: Ensure that the sum of all weights in the kernel is ...
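The steps above can be sketched as follows. This is a minimal sketch, not necessarily the lesson's reference answer. It assumes the unnormalized 2D Gaussian exp(-(x^2 + y^2) / (2 * sigma^2)) and the conventional normalization in which the weights sum to 1. The filtering function follows the TODO comments in the starter code; the reflect padding mode and the explicit double loop are assumptions chosen for readability, not requirements stated in the challenge.

```python
import numpy as np


def create_gaussian_kernel(size, sigma):
    """Return a size x size Gaussian kernel (assumed to be normalized to sum to 1)."""
    if size % 2 == 0:
        raise ValueError("size should be odd so the kernel has a center pixel")

    # Step 1: grid centered on 0, e.g. size=5 gives the axis [-2, -1, 0, 1, 2]
    half = size // 2
    axis = np.arange(-half, half + 1)
    x, y = np.meshgrid(axis, axis)

    # Step 2: evaluate the unnormalized 2D Gaussian on the whole grid at once
    kernel = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))

    # Step 3: normalize (assumption: the weights should sum to 1)
    return kernel / kernel.sum()


def apply_gaussian_filter(image, kernel_size=5, sigma=1.0):
    """Blur a 2D image by sliding the Gaussian kernel over it."""
    kernel = create_gaussian_kernel(kernel_size, sigma)

    # Pad so the output has the same shape as the input.
    # Assumption: reflect padding; the challenge does not specify a mode.
    pad = kernel_size // 2
    padded = np.pad(image, pad, mode="reflect")

    filtered = np.zeros(image.shape, dtype=float)
    rows, cols = image.shape
    for i in range(rows):
        for j in range(cols):
            window = padded[i:i + kernel_size, j:j + kernel_size]
            # The kernel is symmetric, so this weighted sum (a correlation)
            # gives the same result as a true convolution.
            filtered[i, j] = np.sum(window * kernel)
    return filtered
```

Because the weights sum to 1, filtering a constant image should leave it unchanged, which makes a quick sanity check before running the noisy example from the starter code.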