Assignment of Weights in a Neural Network

At the start of training, the weights in a neural network are not assigned arbitrarily. More precisely, they are randomly initialized, and the randomness is deliberately structured.

Random Initialization

Random initialization breaks symmetry. If all weights were initialized to the same value (say zero), every neuron in a layer would receive the same gradient and learn the same features, which makes training ineffective.

Random weights ensure neurons process inputs differently, allowing gradients to flow and useful features to emerge.
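
To see the symmetry problem concretely, here is a minimal PyTorch sketch (the layer sizes, the constant 0.5, and the toy loss are illustrative choices, not from the article):

import torch
import torch.nn as nn

# A tiny network whose parameters all start at the same constant value.
torch.manual_seed(0)
net = nn.Sequential(nn.Linear(4, 3), nn.Tanh(), nn.Linear(3, 1))
for p in net.parameters():
    nn.init.constant_(p, 0.5)

x = torch.randn(8, 4)
loss = net(x).pow(2).mean()
loss.backward()

# Every row of the first layer's weight gradient is identical, so all three
# hidden neurons would receive the same update and remain copies of each other.
print(net[0].weight.grad)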

Initialization Methods

Uniform or Normal Random Initialization

Weights are drawn from a uniform or normal distribution with a hand-picked range or variance. This is not ideal for deep networks: if the scale is off, activations shrink or grow layer by layer, leading to vanishing or exploding gradients.
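
A minimal PyTorch sketch of this naive approach (the layer size and the scale 0.05 are arbitrary illustrative choices):

import torch.nn as nn

layer = nn.Linear(256, 256)

# Fixed-scale uniform (or normal) initialization, independent of layer width.
nn.init.uniform_(layer.weight, a=-0.05, b=0.05)
# or: nn.init.normal_(layer.weight, mean=0.0, std=0.05)

# In a deep stack, a fixed scale like this lets activation magnitudes shrink or
# grow layer after layer, which is where vanishing/exploding gradients come from.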

Xavier Initialization

Designed for sigmoid and tanh activations, Xavier (Glorot) initialization scales the weights so that the variance of the activations stays roughly stable across layers, using a weight variance on the order of 2 / (fan_in + fan_out).
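
A sketch of Xavier initialization in PyTorch (the layer size is chosen only for illustration):

import torch.nn as nn

layer = nn.Linear(256, 256)

# Xavier/Glorot: weight variance ~ 2 / (fan_in + fan_out), suited to sigmoid/tanh.
nn.init.xavier_uniform_(layer.weight)   # or nn.init.xavier_normal_(layer.weight)
nn.init.zeros_(layer.bias)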

He Initialization

Designed for ReLU and its variants, He initialization draws weights with variance of about 2 / fan_in. The larger scale compensates for ReLU zeroing out roughly half of its inputs, keeping the variance of activations and gradients stable.
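
And the corresponding He initialization sketch in PyTorch:

import torch.nn as nn

layer = nn.Linear(256, 256)

# He/Kaiming: weight variance ~ 2 / fan_in, compensating for ReLU zeroing
# out roughly half of its inputs.
nn.init.kaiming_normal_(layer.weight, nonlinearity='relu')
nn.init.zeros_(layer.bias)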

Bias Initialization

Biases are typically initialized to zero, since they play no role in the symmetry problem; random weights alone are enough to break it.

In short, weights are initialized randomly, but not arbitrarily. They are initialized in a way that ensures effective learning from the start.

Deep learning frameworks such as PyTorch and TensorFlow/Keras apply sensible initialization defaults automatically, and both let you pick an initializer explicitly, for example Keras's GlorotNormal or HeNormal, to match the activation function.
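
For example, a Keras sketch that pairs each layer's initializer with its activation (layer sizes and activations here are illustrative, not prescriptive):

import tensorflow as tf

model = tf.keras.Sequential([
    # Glorot/Xavier for a tanh layer.
    tf.keras.layers.Dense(128, activation='tanh',
                          kernel_initializer=tf.keras.initializers.GlorotNormal(),
                          bias_initializer='zeros'),
    # He for a ReLU layer.
    tf.keras.layers.Dense(64, activation='relu',
                          kernel_initializer=tf.keras.initializers.HeNormal(),
                          bias_initializer='zeros'),
])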
