NN Regularization
Preventing overfitting
Briefly: a technique to prevent overfitting (and reduce variance)
Problem
In an overfitting situation, the model learns the details and the noise in the training data too well, which ultimately results in poor performance on unseen data (the test set).
The following graph illustrates this:
Better Definition for Regularization
It is a technique that makes slight modifications to the learning algorithm so that the model generalizes better. This in turn also improves the model's performance on unseen data.
Regularization Techniques
L2 Regularization (Weight Decay)
The most common type of regularization, given by the following formula:
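A standard form (assuming $\lambda$ is the regularization parameter, $m$ the number of training examples, and $W^{[l]}$ the weight matrix of layer $l$):

$$
J_{regularized} = J + \frac{\lambda}{2m} \sum_{l=1}^{L} \left\lVert W^{[l]} \right\rVert_F^2
$$

where $J$ is the original (unregularized) cost and $\lVert W^{[l]} \rVert_F^2$ is the squared Frobenius norm of the layer's weights.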
Here, lambda is the regularization parameter, a hyperparameter whose value is tuned for better results. L2 regularization is also known as weight decay because it forces the weights to decay towards zero (but not exactly zero), as the sketch below shows.
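A minimal NumPy sketch of how the L2 penalty shows up in the gradient step (the values of lambd, the learning rate, and the weight shapes are illustrative, not from these notes):

```python
import numpy as np

lambd, lr, m = 0.7, 0.01, 64      # regularization strength, learning rate, batch size (illustrative)
W = np.random.randn(5, 4)         # weights of one layer
dW_data = np.random.randn(5, 4)   # gradient of the unregularized loss w.r.t. W (stand-in values)

dW = dW_data + (lambd / m) * W    # the L2 penalty adds a (lambda/m) * W term to the gradient
W = W - lr * dW                   # same as shrinking W by (1 - lr * lambd / m) before the usual step,
                                  # which is why L2 regularization is called "weight decay"
```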
Dropout
Another regularization method that randomly eliminates (drops) neurons at a given rate during training.
Simply: each node is dropped with probability p, and a dropped node's input and output weights are not updated during that backpropagation pass (just drop it).
Better visualization:
An NN before and after dropout
It is commonly used in computer vision, but its downside is that the cost function J is no longer well defined, so its value is harder to monitor during training.
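A minimal NumPy sketch of inverted dropout applied to one layer's activations (the function name, keep_prob value, and shapes are illustrative assumptions):

```python
import numpy as np

def dropout_forward(a, keep_prob=0.8):
    """Apply inverted dropout to the activations `a` of one layer."""
    mask = np.random.rand(*a.shape) < keep_prob   # keep each unit with probability keep_prob
    a = a * mask / keep_prob                      # zero out dropped units, rescale the survivors
    return a, mask                                # the same mask zeroes the matching gradients in backprop

# Example: activations of a layer with 4 units over a batch of 3 examples
a = np.random.randn(4, 3)
a_dropped, mask = dropout_forward(a, keep_prob=0.8)
```

Rescaling by 1 / keep_prob keeps the expected value of the activations unchanged, so no extra scaling is needed at test time.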
Data Augmentation
The simplest way to reduce overfitting is to increase the size of the training data. This is not always possible, since getting more data can be costly, but sometimes we can generate new training examples from the data we already have. For example:
Applying transformations to images (flips, rotations, small distortions) can enlarge our data set.
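A minimal sketch of generating extra training images with simple transformations (plain NumPy; the noise scale and the particular set of transformations are illustrative choices):

```python
import numpy as np

def augment(image):
    """Return a few transformed copies of an image array shaped (H, W, C)."""
    flipped = np.fliplr(image)                                       # horizontal flip
    rotated = np.rot90(image)                                        # 90-degree rotation
    noisy = image + np.random.normal(scale=0.01, size=image.shape)   # slight pixel noise
    return [flipped, rotated, noisy]

# Each original image now contributes four examples (itself plus three variants)
image = np.random.rand(32, 32, 3)
augmented = [image] + augment(image)
```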
Early Stopping
It is a kind of cross-validation strategy in which we keep one part of the training set as a validation set. As soon as the performance on the validation set starts getting worse, we stop training the model. This is known as early stopping.
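A minimal, self-contained sketch of the early-stopping rule applied to a list of per-epoch validation losses (the loss values and the patience setting are illustrative):

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the epoch at which training would stop, given per-epoch validation losses."""
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch  # stop here; keep the weights from the best epoch so far
    return len(val_losses) - 1  # never triggered: train to the end

# Validation loss improves, then starts getting worse -> training stops early
val_losses = [0.9, 0.7, 0.6, 0.55, 0.56, 0.58, 0.60, 0.65, 0.70]
print(early_stopping_epoch(val_losses, patience=3))  # prints 6
```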
Read More
Long Story Short: Overfitting and Regularization in Neural Networks