Calculating $\hat{V}$ requires training examples, each pairing a board state with a training value. An example looks like:
\[ \langle \langle bp = 2, rp = 4, bk = 1, rk=0, bt=0, rt=0\rangle, +100\rangle \]
$V_{train}(b)$ denotes the training value for board $b$. For example:
\[ V_{train}(\langle bp = 2, rp = 4, bk = 1, rk=0, bt=0,
rt=0\rangle) = 100 \]
It is easy to assign values to final board states. For intermediate boards, we can use
\[ V_{train}(b) \leftarrow \hat{V}(Successor(b)) \]
Here $Successor(b)$ denotes the next board state following $b$
for which it is again the program's turn to move.
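
The rule for assigning training values can be sketched in Python as follows. This is only an illustration under stated assumptions: the class \texttt{Board}, the function \texttt{training\_value}, and all the \texttt{toy\_*} helpers are hypothetical names, the toy estimate and the successor board are made up, and final boards are signalled by a caller-supplied function that returns \texttt{None} for non-final boards.

```python
from typing import Callable, NamedTuple, Optional


class Board(NamedTuple):
    """A board summarized by the six features named in the text."""
    bp: int
    rp: int
    bk: int
    rk: int
    bt: int
    rt: int


def training_value(
    board: Board,
    v_hat: Callable[[Board], float],
    successor: Callable[[Board], Board],
    final_value: Callable[[Board], Optional[float]],
) -> float:
    """Return V_train(board).

    Final boards get their known value directly; intermediate boards
    get the current estimate v_hat(successor(board)).
    """
    outcome = final_value(board)  # assumption: None means "not a final board"
    if outcome is not None:
        return outcome
    return v_hat(successor(board))


# Hypothetical stand-ins, for illustration only.
b = Board(bp=2, rp=4, bk=1, rk=0, bt=0, rt=0)
next_b = Board(bp=2, rp=3, bk=1, rk=0, bt=0, rt=1)  # made-up successor of b


def toy_v_hat(board: Board) -> float:
    # Made-up estimate: piece difference plus twice the king difference.
    return float(board.bp - board.rp + 2 * (board.bk - board.rk))


def toy_successor(board: Board) -> Board:
    # Lookup table standing in for actually playing the game forward.
    return {b: next_b}[board]


def no_final(board: Board) -> Optional[float]:
    # Treat every board as intermediate in this toy setting.
    return None


# A training example is the pair <board, V_train(board)>.
example = (b, training_value(b, toy_v_hat, toy_successor, no_final))
```

Passing $\hat{V}$ and $Successor$ in as functions keeps the rule itself independent of how either one is actually computed.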