Backpropagation is a key algorithm in training artificial neural networks. It computes the gradient of the loss function with respect to the weights of the network. This gradient is then used to update the weights, which is what enables the model to learn.
Backpropagation takes place during training, after the forward pass, in which the input data is propagated forward through the network to produce predictions. It constitutes the backward pass, in which gradients are calculated and the weights are updated.
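As a rough sketch of this ordering, the toy example below (plain NumPy; the layer sizes, the tanh activation, the squared-error loss, and the names `forward`/`backward` are illustrative assumptions, not a prescribed implementation) runs the forward pass first and the backward pass second:

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny one-hidden-layer network; sizes and initialization are arbitrary.
W1 = rng.normal(size=(3, 4)) * 0.1   # input -> hidden weights
W2 = rng.normal(size=(4, 1)) * 0.1   # hidden -> output weights

def forward(x):
    """Forward pass: push the input through the network to get a prediction."""
    h = np.tanh(x @ W1)      # hidden activations
    y_hat = h @ W2           # network output (the prediction)
    return h, y_hat

def backward(x, h, y_hat, y):
    """Backward pass: propagate the error back and compute weight gradients."""
    d_out = 2 * (y_hat - y)              # d(squared error)/d(output)
    dW2 = h.T @ d_out                    # gradient for output-layer weights
    d_h = (d_out @ W2.T) * (1 - h**2)    # chain rule through tanh
    dW1 = x.T @ d_h                      # gradient for input-layer weights
    return dW1, dW2

x = rng.normal(size=(1, 3))              # one training example
y = np.array([[1.0]])                    # its target value
h, y_hat = forward(x)                    # 1) forward pass
dW1, dW2 = backward(x, h, y_hat, y)      # 2) backward pass (backpropagation)
```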
The network's output, its prediction, is compared with the correct output, and the weights are adjusted iteratively so that the network learns to map the input to the desired output. In effect, the network learns from its mistakes.
Say the network classifies an input image (its prediction). The prediction is compared with the actual answer, and the difference between them is propagated backward through the network. As it travels backward, the weights between neurons are adjusted to reduce the error on future predictions.
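A minimal sketch of that comparison step, where the class scores, the label, and the squared-error measure are illustrative assumptions:

```python
import numpy as np

# Hypothetical network output (class scores) and the true label for one image.
prediction = np.array([0.1, 0.7, 0.2])   # scores for, say, "cat", "dog", "bird"
target     = np.array([0.0, 1.0, 0.0])   # the correct answer is "dog"

# The error (here, mean squared error) is what gets propagated backward.
error = np.mean((prediction - target) ** 2)
print(error)   # ~0.0467
```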
This is where calculus comes in. The chain rule determines how much each weight contributes to the overall error. By calculating these gradients, the algorithm identifies how each weight should be adjusted to minimize the error and improve the network's performance.
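To make the chain rule concrete, here is a sketch for a single neuron, assuming a sigmoid activation, a squared-error loss, and made-up numbers:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One neuron: z = w*x + b, activation a = sigmoid(z), loss L = (a - y)**2.
x, y = 1.5, 1.0      # input and desired output (illustrative values)
w, b = 0.4, 0.1      # current weight and bias

z = w * x + b
a = sigmoid(z)
L = (a - y) ** 2

# Chain rule: dL/dw = dL/da * da/dz * dz/dw
dL_da = 2 * (a - y)             # how the loss changes with the activation
da_dz = a * (1 - a)             # how the activation changes with the pre-activation
dz_dw = x                       # how the pre-activation changes with the weight
dL_dw = dL_da * da_dz * dz_dw   # this weight's contribution to the error
```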
The math can get complex, but the idea is simple: the network iteratively refines its internal connections based on the errors it makes.
The weights are adjusted using gradient descent, a common optimization algorithm. The specific calculations apply the chain rule of calculus repeatedly to differentiate the error function through the layers of the network.
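A hedged sketch of the update step, assuming the gradients have already been produced by backpropagation and using a made-up learning rate:

```python
import numpy as np

# Suppose backpropagation produced this gradient for a small weight matrix.
W = np.array([[0.5, -0.2],
              [0.3,  0.8]])
grad_W = np.array([[0.10, -0.05],
                   [0.02,  0.40]])   # hypothetical gradient values

learning_rate = 0.1                  # illustrative hyperparameter
W = W - learning_rate * grad_W       # gradient descent: step against the gradient
```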
Gradients give both the direction and the magnitude of change in the function as its inputs change; in this context, the inputs are the weights connecting neurons.
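One way to see the direction-and-magnitude interpretation is a finite-difference check on a single weight; the one-weight model, its input, target, and nudge size are all assumptions chosen for illustration:

```python
def loss(w, x=1.5, y=1.0):
    """Squared error of a one-weight model whose prediction is w * x."""
    return (w * x - y) ** 2

w, eps = 0.4, 1e-6
approx_grad = (loss(w + eps) - loss(w - eps)) / (2 * eps)
# approx_grad is about -1.2: the negative sign says the loss decreases as w
# grows, and the size 1.2 says roughly how fast it changes near w = 0.4.
```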