A learner that does not assume the target concept is
representable in $H$ and simply returns the hypothesis
with minimum training error is called an agnostic
learner.
That is, we seek the hypothesis $h \in H$ that makes the fewest
errors on the training data.
The sample complexity is
\[ m \geq \frac{1}{2 \epsilon^{2}}(\ln|H| + \ln(1/\delta)) \]
derived from Hoeffding bounds:
\[ Pr[error_{\cal{D}}(h) > error_D(h) + \epsilon] \leq e^{-2m\epsilon^{2}} \]
where $D$ is a set of $m$ randomly drawn training examples,
$error_D(h)$ is the error of $h$ on that sample, and
$error_{\cal{D}}(h)$ is the true error of $h$ under the
underlying distribution $\cal{D}$.
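As a quick numerical sketch of the sample-complexity bound above, the snippet below computes the smallest $m$ satisfying $m \geq \frac{1}{2\epsilon^2}(\ln|H| + \ln(1/\delta))$. The concrete values of $|H|$, $\epsilon$, and $\delta$ are illustrative assumptions, not taken from the text.

```python
import math

def agnostic_sample_complexity(h_size, epsilon, delta):
    """Smallest integer m with m >= (1 / (2 * eps^2)) * (ln|H| + ln(1/delta))."""
    return math.ceil((math.log(h_size) + math.log(1.0 / delta))
                     / (2.0 * epsilon ** 2))

# Assumed example values: |H| = 1000, epsilon = 0.1, delta = 0.05.
m = agnostic_sample_complexity(1000, 0.1, 0.05)
print(m)  # 496
```

Note the $1/\epsilon^2$ dependence: halving the accuracy parameter $\epsilon$ quadruples the required sample size, in contrast to the $1/\epsilon$ dependence of the realizable (consistent-learner) case.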