
Architectural Components

Explore the architecture of self-driving car systems, focusing on how visual understanding components like CNNs and semantic image segmentation inform real-time action prediction using RNNs or LSTMs. Understand the training and prediction flows including data annotation, augmentation, and transfer learning.

Overall architecture for self-driving vehicle

Let’s discuss a simplified, high-level architecture for building a self-driving car. We’ll cover the key learning problems to be solved and how different learning models fit together, as shown below.

High-level architecture (CNN: Convolutional neural network, RNN: Recurrent neural network, LSTM: Long short-term memory)

The system receives sensory inputs via cameras and radars, which are fed to the visual understanding system consisting of different convolutional neural networks (CNNs), one for each specific subtask. The output of the visual understanding system is consumed by the action predictor, an RNN or LSTM. Based on the visual understanding of the environment, this component plans the vehicle's next move, which is a combination of actions: braking, accelerating, and/or steering.
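As a rough sketch of this pipeline, the snippet below fuses feature vectors from two hypothetical perception CNNs (e.g., segmentation and detection subtasks) and feeds the result through a single hand-rolled LSTM cell whose hidden state is mapped to three action scores. All dimensions, weights, and module names are illustrative assumptions, not the actual system:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dimensions: each perception CNN emits a feature vector;
# the action predictor consumes their concatenation.
FEAT_SEG, FEAT_DET, HIDDEN, ACTIONS = 16, 16, 32, 3  # steer, throttle, brake

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    """One LSTM step over the fused perception features (illustrative only)."""
    def __init__(self, in_dim, hid_dim):
        # One stacked weight matrix for the 4 gates: input, forget, cell, output.
        self.W = rng.standard_normal((4 * hid_dim, in_dim + hid_dim)) * 0.1
        self.b = np.zeros(4 * hid_dim)

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, g, o = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)   # update cell state
        h = o * np.tanh(c)           # new hidden state
        return h, c

# Stand-ins for the CNN subtask outputs at one time step.
seg_feat = rng.standard_normal(FEAT_SEG)
det_feat = rng.standard_normal(FEAT_DET)
fused = np.concatenate([seg_feat, det_feat])

cell = LSTMCell(FEAT_SEG + FEAT_DET, HIDDEN)
W_out = rng.standard_normal((ACTIONS, HIDDEN)) * 0.1

h, c = np.zeros(HIDDEN), np.zeros(HIDDEN)
for _ in range(5):                   # five consecutive camera frames
    h, c = cell.step(fused, h, c)

actions = W_out @ h                  # raw scores for (steering, throttle, brake)
print(actions.shape)
```

The recurrent state `(h, c)` is what lets the action predictor condition on recent frames rather than a single snapshot; in a real system the weights would be learned end to end and each frame would produce fresh perception features.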

📝 We won’t be discussing input through Lidar here. However, it can also be used for scene analysis similar to a camera, especially for reconstructing a 3D ...