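Since the networks have multiple output units, we must redefine the error $E$ to sum the errors over all output units:

$$E(\vec{w}) = \frac{1}{2} \sum_{d \in D} \sum_{k \in outputs} (t_{kd} - o_{kd})^2$$

where $D$ is the set of training examples, $outputs$ is the set of output units, and $t_{kd}$ and $o_{kd}$ are the target and output values
associated with the kth output unit and example $d$.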
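As a rough illustration of the summed error, the sketch below assumes the usual sum-of-squares form with a conventional factor of 1/2, accumulated over every training example and every output unit. The function name `total_error` and the nested-list layout (`targets[d][k]`, `outputs[d][k]`) are hypothetical choices made for this example, not notation from the text.

```python
def total_error(targets, outputs):
    """Half the sum of squared errors over all examples and output units.

    targets[d][k] and outputs[d][k] are the target and actual values of
    output unit k for training example d (illustrative layout).
    """
    total = 0.0
    for t_d, o_d in zip(targets, outputs):
        if len(t_d) != len(o_d):
            raise ValueError("each example needs one target per output unit")
        for t_kd, o_kd in zip(t_d, o_d):
            total += (t_kd - o_kd) ** 2
    # The factor 1/2 is an assumed convention; it only rescales the error.
    return 0.5 * total


# Two training examples, two output units each.
targets = [[1.0, 0.0], [0.0, 1.0]]
outputs = [[0.5, 0.5], [0.0, 1.0]]
error = total_error(targets, outputs)
```

Here the first example contributes 0.25 + 0.25 and the second contributes nothing, so `error` is 0.25 under the assumed 1/2 scaling.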
After some algebra (not shown) we can derive that the error slope for each weight is