Gradient Ascent for Bayes Nets
- Let $w_{ijk}$ denote one entry in the conditional probability table for
variable $Y_i$ in the network:
\[ w_{ijk} = P(Y_i = y_{ij} \,|\, Parents(Y_i) = u_{ik}) \]
where $u_{ik}$ is the $k$-th list of values for the parents of $Y_i$.
E.g., if $Y_i = Campfire$, then $u_{ik}$ might be $\langle Storm=T, BusTourGroup=F \rangle$.
- Perform gradient ascent by repeatedly
- updating all $w_{ijk}$ using the training data $D$:
\[ w_{ijk} \leftarrow w_{ijk} + \eta \sum_{d \in D} \frac{P_h(y_{ij}, u_{ik} \,|\, d)}{w_{ijk}} \]
- then re-normalizing the $w_{ijk}$ to ensure $\sum_{j} w_{ijk} = 1$ and $0 \leq w_{ijk} \leq 1$ (see the sketch after this list).
- If some of the needed probabilities cannot be read directly from the data (e.g., some variables are unobserved), they can be computed via inference over the network.
- Again, this only finds locally optimal conditional probabilities.
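
To make the update rule concrete, here is a minimal Python/NumPy sketch (not from the original slides) of one gradient-ascent step on the CPT of a single variable $Y_i$. The function name, the array layout, and the stand-in posteriors are illustrative assumptions; the `joint_posteriors` argument represents the quantities $P_h(y_{ij}, u_{ik}\,|\,d)$ that would come from inference over the network.

```python
import numpy as np

def gradient_ascent_step(w, joint_posteriors, eta=0.01):
    """One gradient-ascent update of the CPT for a single variable Y_i.

    w                : (J, K) array; w[j, k] = P(Y_i = y_ij | u_ik)
    joint_posteriors : list of (J, K) arrays, one per training example d,
                       holding P_h(y_ij, u_ik | d) obtained by inference
    eta              : learning rate
    """
    # Gradient step: w_ijk <- w_ijk + eta * sum_d P_h(y_ij, u_ik | d) / w_ijk
    w = w + eta * sum(joint_posteriors) / w

    # Re-normalize each parent configuration (column k) so that
    # sum_j w[j, k] = 1 and 0 <= w[j, k] <= 1
    w = np.clip(w, 1e-12, None)
    return w / w.sum(axis=0, keepdims=True)

# Toy usage: Y_i = Campfire (J = 2 values); parents (Storm, BusTourGroup)
# give K = 4 joint parent-value configurations. The posteriors here are
# random stand-ins for the output of an actual inference routine.
rng = np.random.default_rng(0)
w = rng.random((2, 4))
w /= w.sum(axis=0, keepdims=True)
posteriors = [rng.random((2, 4)) for _ in range(5)]
w = gradient_ascent_step(w, posteriors)
```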