Training and Results

Explore how to train actor-critic networks using GAIL and PPO algorithms for reinforcement learning in video game environments. Understand how to classify expert and agent actions, update policies with minibatches, and evaluate reward progress and discriminator performance during training. This lesson provides hands-on experience with imitation learning techniques to improve agent behavior.

To train the actor-critic network, we apply a loss function that tries to classify expert (observation, action) pairs as 0s and agent (observation, action) pairs as 1s. As the agent learns to generate high-quality (observation, action) pairs that resemble the expert's, the discriminator will have increasing difficulty distinguishing between samples from the agent and the expert, and will assign agent samples a label of ...
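The labeling scheme described above can be sketched as a binary cross-entropy loss. This is a minimal illustration in plain Python, not the lesson's implementation: the names `discriminator_loss`, `expert_probs`, and `agent_probs` are hypothetical, and each probability is assumed to be the discriminator's estimate that a pair came from the agent.

```python
import math

def discriminator_loss(expert_probs, agent_probs, eps=1e-8):
    """Binary cross-entropy with expert pairs labeled 0 and agent pairs labeled 1.

    Each input value is assumed to be the discriminator's predicted
    probability that an (observation, action) pair came from the agent.
    """
    # Expert pairs carry label 0, so we penalize -log(1 - p).
    expert_term = sum(-math.log(1.0 - p + eps) for p in expert_probs)
    # Agent pairs carry label 1, so we penalize -log(p).
    agent_term = sum(-math.log(p + eps) for p in agent_probs)
    # Average over every pair in the batch.
    return (expert_term + agent_term) / (len(expert_probs) + len(agent_probs))

# A discriminator that separates the two sources well has a low loss.
confident = discriminator_loss([0.1, 0.2], [0.9, 0.8])
# A discriminator that cannot tell them apart outputs 0.5 for every pair.
confused = discriminator_loss([0.5, 0.5], [0.5, 0.5])
```

When the discriminator outputs 0.5 for every pair, the loss equals log 2 (about 0.693), which is the value it would settle at if agent and expert samples were indistinguishable to it.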