The stochastic gradient descent algorithm is a
simple variation on gradient descent: instead of adding up the
gradients over all examples, we update weights
incrementally.
That is, change the weights after each example.
For each training example in we compute the
gradient and then
We can view this as going down one of the different
landscapes (one for each ) defined as
at each step.