Reinforcement Learning

Q-Learning Algorithm

  1. For each $s,a$ set $\hat{Q}(s,a) = 0$.
  2. Observe the current state $s$.
  3. Select an action $a$ and execute it.
  4. Receive reward $r$ and observe the new state $s'$.
  5. Update the table entry for $\hat{Q}(s,a)$ with \[ \hat{Q}(s,a) \leftarrow r + \gamma \max_{a'}\hat{Q}(s',a') \] where $\gamma \in [0,1)$ is the discount factor.
  6. $s \leftarrow s'$ // The new state becomes the current state.
  7. Goto 3.
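
The steps above can be sketched in Python as a minimal illustration rather than a definitive implementation. Everything about the environment is an assumption made for this example: a hypothetical row of four states, two actions (`LEFT`, `RIGHT`), a reward of 1 for stepping right from the last state (which then restarts at state 0), uniformly random action selection for step 3, and a fixed step budget in place of the endless loop. The names `step`, `q_learning`, and `greedy_policy` are likewise illustrative.

```python
import random

# Hypothetical toy world (not from the slides): states 0..3 in a row.
N_STATES = 4
LEFT, RIGHT = 0, 1
ACTIONS = (LEFT, RIGHT)


def step(state, action):
    """Deterministic toy environment: returns (next_state, reward).

    Moving RIGHT from the last state pays reward 1 and restarts at state 0.
    Every other move pays 0. Moving LEFT from state 0 stays in state 0.
    """
    if action == RIGHT:
        if state == N_STATES - 1:
            return 0, 1.0
        return state + 1, 0.0
    return max(state - 1, 0), 0.0


def q_learning(n_steps=10_000, gamma=0.9, seed=0):
    """Tabular Q-learning with the deterministic update rule from the list."""
    rng = random.Random(seed)
    # Step 1: set every table entry to 0.
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    # Step 2: observe the current state.
    s = 0
    for _ in range(n_steps):
        # Step 3: select an action (assumed here: uniformly at random).
        a = rng.choice(ACTIONS)
        # Step 4: receive the reward and observe the new state.
        s_next, r = step(s, a)
        # Step 5: overwrite the entry with r + gamma * max_a' Q(s', a').
        q[(s, a)] = r + gamma * max(q[(s_next, b)] for b in ACTIONS)
        # Step 6: the new state becomes the current state.
        s = s_next
    return q


def greedy_policy(q):
    """Best-valued action in each state according to the learned table."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES)]
```

In this toy world, `greedy_policy(q_learning())` should choose `RIGHT` in every state, since that is the only way to collect the reward. Random action selection is the simplest choice that visits every state-action pair; other selection strategies would fit step 3 equally well.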

José M. Vidal
