Given: a set of instances $X$, a set of hypotheses $H$, a set
of possible target concepts $C$, and training instances
generated by a fixed, unknown probability distribution
$\cal{D}$ over $X$.
The learner observes a sequence $D$ of training examples of
the form $\langle x, c(x)\rangle$, for some target concept $c \in
C$.
The instances $x$ are drawn from distribution $\cal{D}$
and the teacher provides the target value $c(x)$ for each.
The learner must output a hypothesis $h$ estimating $c$. The
hypothesis $h$ is evaluated by its performance on subsequent
instances drawn according to $\cal{D}$.
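
The setting above can be illustrated with a small simulation. This is only a sketch under assumptions that are not part of the problem statement: $X = \{0, \dots, 99\}$, $\cal{D}$ is taken to be uniform, the target concept is a threshold at 37, and $H$ is the set of threshold functions. All function names are hypothetical.

```python
import random

def draw_instance(rng):
    """Draw one instance x from D (assumed uniform over X = {0, ..., 99})."""
    return rng.randrange(100)

def target_concept(x):
    """Hypothetical target concept c: a threshold at 37."""
    return x >= 37

def learn_threshold(examples):
    """Return a threshold hypothesis h consistent with the training examples."""
    positives = [x for x, label in examples if label]
    # Use the smallest positive example seen; with no positives,
    # the threshold 100 makes h predict negative on all of X.
    t = min(positives) if positives else 100
    return lambda x: x >= t

def estimate_error(h, c, rng, n):
    """Fraction of n fresh instances drawn from D on which h disagrees with c."""
    fresh = [draw_instance(rng) for _ in range(n)]
    return sum(h(x) != c(x) for x in fresh) / n

rng = random.Random(0)

# The learner observes training examples <x, c(x)> with x drawn from D.
training = [(x, target_concept(x)) for x in (draw_instance(rng) for _ in range(30))]

# The learner outputs a hypothesis h estimating c.
h = learn_threshold(training)

# h is evaluated on subsequent instances drawn according to D.
err = estimate_error(h, target_concept, rng, 10_000)
print(f"estimated error of h on fresh instances: {err:.4f}")
```

Note that the evaluation uses fresh draws from the same distribution that produced the training data, not the training examples themselves, which is the point of the last sentence of the setting.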