Information Gain
- The information gain is the expected reduction
in entropy caused by partitioning the examples with respect to
an attribute.
- Given $S$ is the set of examples, $A$ the attribute, and
$S_v$ the subset of $S$ for which attribute $A$ has value $v$:
\[ Gain(S,A) \equiv Entropy(S) - \sum_{v \in Values(A)} \frac{|S_{v}|}{|S|}
Entropy(S_{v}) \]
- That is, current entropy minus new entropy.
- Using our set of examples we can now calculate:
- Original Entropy = 0.94
- Humidity = High entropy = 0.985
- Humidity = Normal entropy = 0.592
- $Gain (S,Humidity) = .94 - \left(\frac{7}{14}\right).985 - \left(\frac{7}{14}\right).592 = .151$
- Wind = Weak entropy = 0.811
- Wind = Strong entropy = 1.0
- $Gain (S,Wind) = .94 - \left(\frac{8}{14}\right).811 - \left(\frac{6}{14}\right)1.0 = .048$
- So Humidity provides a greater information gain.
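The calculation above can be sketched in a few lines of Python. The entropy and gain functions follow the formula on this slide; the class counts (9 positive, 5 negative overall; the per-value splits for Humidity and Wind) are the standard PlayTennis counts consistent with the entropies quoted above.

```python
import math

def entropy(pos, neg):
    """Entropy of a boolean-labeled set with pos positive and neg negative examples."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h

# Overall: 9 positive, 5 negative examples -> entropy about 0.94.
original = entropy(9, 5)

# Humidity = High (3+, 4-) and Normal (6+, 1-), 7 examples each.
gain_humidity = original - (7/14) * entropy(3, 4) - (7/14) * entropy(6, 1)

# Wind = Weak (6+, 2-) and Strong (3+, 3-).
gain_wind = original - (8/14) * entropy(6, 2) - (6/14) * entropy(3, 3)

print(round(gain_humidity, 3), round(gain_wind, 3))
```

The unrounded gains come out near 0.152 and 0.048; the slide's 0.151 reflects rounding the intermediate entropies.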
José M. Vidal