Search⌘ K

Exercise Solution

Understand the Darknet-19 architecture used in YOLOv7, including its 19 convolutional layers, max-pooling layers, batch normalization, and activation functions. Learn how this CNN model processes inputs with sequential convolutional blocks, global average pooling, and fully connected layers for object detection tasks.

We'll cover the following...

The Darknet-19 model is a deep convolutional neural network architecture introduced in YOLOv2. It consists of 19 convolutional layers and 5 max-pooling layers.

C++
import torch
import torch.nn as nn
class Darknet19(nn.Module):
def __init__(self, num_classes=1000):
super(Darknet19, self).__init__()
# Helper function to add a convolutional block
def conv_block(in_channels, out_channels, kernel_size, stride=1, max_pool=False):
layers = [
nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding=(kernel_size-1)//2, bias=False), # Convolutional layer
nn.BatchNorm2d(out_channels), # Batch normalization
nn.LeakyReLU(0.1, inplace=True) # Leaky ReLU activation
]
if max_pool:
layers.append(nn.MaxPool2d(2, 2)) # Max pooling
return nn.Sequential(*layers)
# Darknet-19 architecture
self.model = nn.Sequential(
# Layer 1
conv_block(3, 32, 3, max_pool=True),
# Layer 2
conv_block(32, 64, 3, max_pool=True),
# Layer 3-4
conv_block(64, 128, 3),
conv_block(128, 64, 1),
# Layer 5
conv_block(64, 128, 3, max_pool=True),
# Layer 6-7
conv_block(128, 256, 3),
conv_block(256, 128, 1),
# Layer 8
conv_block(128, 256, 3, max_pool=True),
# Layer 9-11
conv_block(256, 512, 3),
conv_block(512, 256, 1),
conv_block(256, 512, 3, max_pool=True),
# Layer 12-14
conv_block(512, 1024, 3),
conv_block(1024, 512, 1),
conv_block(512, 1024, 3),
# Layer 15-16
conv_block(1024, 512, 1),
conv_block(512, 1024, 3)
)
# Final layers
self.fc = conv_block(1024, num_classes,1) # Fully connected layer
self.avgpool = nn.AdaptiveAvgPool2d((1, 1)) # Global average pooling
def forward(self, x):
x = self.model(x) # Pass through Darknet-19
x = self.avgpool(x) # Global average pooling
x = torch.flatten(x, 1) # Flatten the tensor
x = self.fc(x) # Classification layer
return x
# Create the Darknet-19 model and print its architecture
model = Darknet19()
print(model)

Explanation

The Darknet-19 architecture is a convolutional neural network designed primarily for object detection tasks. It consists of 19 convolutional layers, interspersed with max-pooling layers for spatial downsampling.

  • Lines 1–4: We import torch and ...