Begin with a network with no hidden units, then grow it as
needed by adding new units until the error is low enough. An
example is the Cascade-Correlation algorithm.
With this growing approach it is easy to overfit the data.
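The growing idea can be sketched as below. This is a simplified stand-in, not Cascade-Correlation itself: real Cascade-Correlation trains each candidate unit to maximize the correlation between its output and the residual error, whereas this sketch assumes random, frozen input weights and refits only the output layer by least squares. The names `grow_network`, `target_error` and `max_hidden`, and the toy data, are hypothetical.

```python
import numpy as np

def fit_output_weights(features, targets):
    # Least-squares fit of the output layer only.
    weights, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return weights

def mse(features, weights, targets):
    return float(np.mean((features @ weights - targets) ** 2))

def grow_network(x, y, target_error=1e-3, max_hidden=20, seed=0):
    rng = np.random.default_rng(seed)
    # Start with no hidden units: the inputs plus a bias feed the output directly.
    features = np.hstack([x, np.ones((len(x), 1))])
    weights = fit_output_weights(features, y)
    errors = [mse(features, weights, y)]
    hidden = 0
    while errors[-1] > target_error and hidden < max_hidden:
        # The new unit sees the inputs and all earlier hidden units (a cascade).
        # Assumption: its input weights are random and then frozen.
        w_in = rng.normal(size=features.shape[1])
        new_unit = np.tanh(features @ w_in)
        features = np.hstack([features, new_unit[:, None]])
        weights = fit_output_weights(features, y)
        errors.append(mse(features, weights, y))
        hidden += 1
    return hidden, errors

# Toy regression problem (hypothetical data).
x = np.linspace(-1.0, 1.0, 50)[:, None]
y = np.sin(3.0 * x[:, 0])
hidden, errors = grow_network(x, y)
print(hidden, errors[0], errors[-1])
```

The loop stops on the training error, which keeps falling as units are added. That is the overfitting risk: stopping on a held-out validation error instead would be a safer criterion.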
Begin with a complex network and prune it as we find unessential connections.
Simplest criterion: a connection is unessential if its weight is almost 0.
Alternatively, consider the way in which a small variation in the weight
affects the error, and prune the connections with the least effect. This is the more successful method.
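The two pruning criteria can disagree, as the following sketch suggests. It assumes a single linear unit trained by least squares on hypothetical data with inputs on very different scales, and it formalizes "effect on the error" as a second-order saliency, 0.5 * (d2E/dw_i^2) * w_i^2, in the style of Optimal Brain Damage. That choice is an assumption; the notes do not name a specific measure.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
# Two inputs on very different scales (hypothetical data).
x = np.column_stack([100.0 * rng.normal(size=n), rng.normal(size=n)])
true_w = np.array([0.05, 0.5])
y = x @ true_w + 0.01 * rng.normal(size=n)

# "Trained" weights: the least-squares solution for a single linear unit.
w, *_ = np.linalg.lstsq(x, y, rcond=None)

def error(weights):
    return float(np.mean((x @ weights - y) ** 2))

def error_after_pruning(index):
    pruned = w.copy()
    pruned[index] = 0.0  # remove the connection, no retraining
    return error(pruned)

# Criterion 1: prune the weight closest to zero.
by_magnitude = int(np.argmin(np.abs(w)))

# Criterion 2: prune the weight whose removal is estimated to raise the error least.
hessian_diag = 2.0 * np.mean(x ** 2, axis=0)  # d2E/dw_i^2 for the mean squared error
saliency = 0.5 * hessian_diag * w ** 2
by_sensitivity = int(np.argmin(saliency))

print("magnitude prunes", by_magnitude, "-> error", error_after_pruning(by_magnitude))
print("sensitivity prunes", by_sensitivity, "-> error", error_after_pruning(by_sensitivity))
```

Here the smallest weight multiplies the largest input, so removing it hurts far more than removing the larger weight. For this linear model the saliency equals the actual error increase; in a nonlinear network it is only an approximation.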
In general, these methods show mixed success in accuracy but can
improve training times.