A learner that does not assume the target concept is
representable in $H$ and simply returns the hypothesis
with minimum training error is called an agnostic
learner.
That is, we seek the hypothesis $h \in H$ that makes the fewest
errors on the training data.
The sample complexity is
\[ m \geq \frac{1}{2 \epsilon^{2}}(\ln|H| + \ln(1/\delta)) \]
derived from Hoeffding bounds:
\[ Pr[error_{\cal{D}}(h) > error_D(h) + \epsilon] \leq e^{-2m\epsilon^{2}} \]
where $D$ is a set of $m$ randomly drawn training examples,
$error_D(h)$ is the error of $h$ on that sample, and
$error_{\cal{D}}(h)$ is the true error of $h$ under the
underlying distribution $\cal{D}$.
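As a quick numerical sketch of the sample-complexity bound above, the snippet below computes the smallest $m$ satisfying $m \geq \frac{1}{2\epsilon^2}(\ln|H| + \ln(1/\delta))$. The concrete values of $|H|$, $\epsilon$, and $\delta$ are illustrative assumptions, not taken from the text.

```python
import math

def agnostic_sample_complexity(h_size, epsilon, delta):
    """Smallest integer m with m >= (1 / (2 * eps^2)) * (ln|H| + ln(1/delta))."""
    return math.ceil((math.log(h_size) + math.log(1.0 / delta))
                     / (2.0 * epsilon ** 2))

# Assumed example values: |H| = 1000, epsilon = 0.1, delta = 0.05.
m = agnostic_sample_complexity(1000, 0.1, 0.05)
print(m)  # 496
```

Note the $1/\epsilon^2$ dependence: halving the accuracy parameter $\epsilon$ quadruples the required sample size, in contrast to the $1/\epsilon$ dependence of the realizable (consistent-learner) case.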