Entropy as Encoding Length
- We can also say that $Entropy(S)$ equals the expected
number of bits needed to encode the class ($\oplus$ or $\ominus$)
of a randomly drawn member of $S$ using the optimal,
shortest-length code.
- Why?
- Information theory: the optimal-length code assigns
$-\log_{2} p$ bits to a message having probability $p$.
- Imagine I'm choosing elements from $S$ at random and
telling you whether each one is $\oplus$ or $\ominus$. How many
bits per element will I need? (We work out the encoding
beforehand.)
- If a message has probability 1, then its encoding length is
0. Why? Because the outcome is certain, and $-\log_{2} 1 = 0$.
- If the probability is 0.5, then we need 1 bit (the maximum),
since $-\log_{2} 0.5 = 1$.
- So, the expected number of bits needed to encode whether a random
member of $S$ is $\oplus$ or $\ominus$ is (see the sketch below):
\[ p_{\oplus} (-\log_{2} p_{\oplus}) + p_{\ominus} (-\log_{2} p_{\ominus}), \]
which is exactly
\[ Entropy(S) \equiv - p_{\oplus} \log_{2} p_{\oplus} - p_{\ominus} \log_{2} p_{\ominus}. \]
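As a concrete check, here is a minimal Python sketch (an illustrative addition, not part of the original slide; the sample with 9 positive and 5 negative examples is a made-up example) that computes the optimal code length $-\log_2 p$ for the two special cases above and the entropy of a boolean sample:

```python
import math

def optimal_code_length(p):
    """Bits an optimal code assigns to a message of probability p: -log2(p)."""
    return -math.log2(p)

def entropy(p_pos):
    """Entropy of a boolean sample with positive-class proportion p_pos.

    Uses the convention that 0 * log2(0) contributes 0 bits.
    """
    total = 0.0
    for p in (p_pos, 1.0 - p_pos):
        if p > 0:
            total += p * optimal_code_length(p)  # p * (-log2 p)
    return total

# A certain message (probability 1) needs 0 bits; probability 0.5 needs 1 bit.
print(optimal_code_length(1.0))   # -0.0, i.e. 0 bits
print(optimal_code_length(0.5))   # 1.0 bit

# Entropy of a hypothetical sample with 9 positive and 5 negative examples.
print(entropy(9 / 14))            # ~0.940 bits per example
```

The printed entropy (about 0.94 bits) is the expected number of bits per example under the optimal code, matching the formula above.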