Dealing With Overfitting
- Either stop growing the tree earlier or prune it
after-wards. Pruning has been more effective.
- Use a separate set of examples (not training) to evaluate
the utility of post-pruning nodes.
- Use a statistical test to estimate whether expanding a
node is likely to improve performance beyond the training
set.
- Use explicit measure of the complexity for encoding the
training examples and the decision tree. Stop when this
encoding size is minimize. Minimum Description Length
principle (later).
José M. Vidal
.
19 of 25