Entropy
- $S$ is a sample of training examples
- $p_{\oplus}$ is the proportion of positive examples in $S$.
- $p_{\ominus}$ is the proportion of negative examples in $S$.
- Entropy measures the impurity of $S$:
\[ Entropy(S) \equiv - p_{\oplus} \log_{2} p_{\oplus} - p_{\ominus} \log_{2} p_{\ominus} \]
- For example, if $S$ has 9 positive and 5 negative examples, its entropy is
\[Entropy([9+,5-]) = -\left(\frac{9}{14}\right)\log_2\left(\frac{9}{14}\right) - \left(\frac{5}{14}\right)\log_2\left(\frac{5}{14}\right) = 0.94\]
- This function is 0 when $p_{\oplus} = 0$ or $p_{\oplus} = 1$ (taking $0 \log_2 0 = 0$ by convention), and it reaches its maximum of 1 when $p_{\oplus} = 0.5$.
- That is, entropy is maximized when the degree of “confusion” is maximized; the sketch below reproduces the example computation.
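As a quick check of the formula, here is a minimal Python sketch (the `entropy` function name and signature are ours, not from the slides) that computes the entropy of a sample from its positive and negative counts, using the $0 \log_2 0 = 0$ convention:

```python
import math

def entropy(pos, neg):
    """Entropy of a sample with `pos` positive and `neg` negative examples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        p = count / total
        if p > 0:                     # convention: 0 * log2(0) = 0
            result -= p * math.log2(p)
    return result

print(entropy(9, 5))   # ~0.940, matching Entropy([9+,5-]) above
print(entropy(7, 7))   # 1.0: maximum impurity at p = 0.5
print(entropy(14, 0))  # 0.0: a pure sample has no impurity
```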