# Actors and Actor-Critics

Learn about policy search and actor-critic schemes.


## Policy search

So far, we have focused on finding a value function, from which we derived the greedy policy by choosing the action that leads to the state with the largest expected return. The value function can be seen as a **critic** that evaluates the policy and drives its adaptation. Another approach, especially when using function approximators, is to parameterize the policy directly and search for good policy parameters. Such a directly parameterized policy is called an **actor**. The task is then to find the parameters that maximize the payoff. We illustrated such a setting in this
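To make the idea of searching directly for policy parameters concrete, here is a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the lesson: a five-state chain with a reward of 1 at the right end, a policy with a single parameter `theta` that sets the probability of stepping right, and a simple random hill climb as the search method. No value function is learned; the policy is judged only by its estimated payoff.

```python
import math
import random

# Hypothetical toy task: a chain of states 0..4, both ends terminal.
N_STATES = 5
START_STATE = 2
MAX_STEPS = 50


def prob_right(theta):
    """Parameterized policy: probability of stepping right is sigmoid(theta)."""
    return 1.0 / (1.0 + math.exp(-theta))


def run_episode(theta, rng):
    """Follow the policy once; payoff is 1 if the right end is reached, else 0."""
    state = START_STATE
    p = prob_right(theta)
    for _ in range(MAX_STEPS):
        state += 1 if rng.random() < p else -1
        if state == N_STATES - 1:
            return 1.0
        if state == 0:
            return 0.0
    return 0.0  # ran out of steps without reaching either end


def estimate_payoff(theta, episodes=200, seed=0):
    """Average payoff over sampled episodes.

    The same seed is reused for every evaluation, so the estimate is a
    deterministic function of theta and candidates are compared fairly.
    """
    rng = random.Random(seed)
    return sum(run_episode(theta, rng) for _ in range(episodes)) / episodes


def hill_climb(iterations=200, step_size=0.5, seed=1):
    """Random hill climbing over the single policy parameter theta."""
    rng = random.Random(seed)
    theta = 0.0  # start with equal probability for left and right
    best = estimate_payoff(theta)
    for _ in range(iterations):
        candidate = theta + rng.gauss(0.0, step_size)
        payoff = estimate_payoff(candidate)
        if payoff > best:  # keep the candidate only if the payoff improves
            theta, best = candidate, payoff
    return theta, best


initial_payoff = estimate_payoff(0.0)
best_theta, best_payoff = hill_climb()
print(f"initial payoff {initial_payoff:.2f}, after search {best_payoff:.2f}, "
      f"P(right) = {prob_right(best_theta):.2f}")
```

Hill climbing is used here only because it is the simplest search that needs nothing but payoff estimates. Gradient-based policy search follows the same pattern, with the random perturbation replaced by a step along an estimated gradient of the payoff.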

