Quiz

Test yourself on reinforcement learning.

1

Which of the following is the formula for the transition function of the environment in a Markov decision process?

A)

$$\begin{array}{ll} a_{t}=\pi\left(s_{t}\right) & \text{deterministic} \\ \pi\left(a_{t} \mid s_{t}\right) & \text{stochastic} \end{array}$$

B)

$$\begin{array}{ll} r_{t+1}=\rho\left(s_{t}, a_{t}\right) & \text{deterministic} \\ \rho\left(r_{t+1} \mid s_{t}, a_{t}\right) & \text{stochastic} \end{array}$$

C)

$$\begin{array}{ll} s_{t+1}=\tau\left(s_{t}, a_{t}\right) & \text{deterministic} \\ \tau\left(s_{t+1} \mid s_{t}, a_{t}\right) & \text{stochastic} \end{array}$$

