🎨Softmax Regression

Multi-class problems

We can understand it by comparing it to logistic regression: 😋

Recall that logistic regression produces a decimal between 0 and 1.0. For example, a logistic regression output of 0.8 from an email classifier suggests an 80% chance of an email being spam and a 20% chance of it being not spam. Clearly, the sum of the probabilities of an email being either spam or not spam is 1.0.

Softmax extends this idea into the MULTI-CLASS world. That is, Softmax assigns decimal probabilities to each class in a multi-class problem. Those decimal probabilities must add up to 1.0.

  • Its other name is Maximum Entropy (MaxEnt) Classifier

We can say that softmax regression generalizes logistic regression.

Logistic regression is the special case of softmax where C = 2 🤔
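
To check the C = 2 claim numerically, here is a minimal NumPy sketch. The helper names `softmax` and `sigmoid` are illustrative and not from these notes, and pinning the second logit to 0 is an assumption of the sketch (softmax depends only on the difference between logits, so nothing is lost).

```python
import numpy as np

def softmax(z):
    """Softmax over a 1-D array of logits (shifted by the max for stability)."""
    t = np.exp(z - np.max(z))
    return t / t.sum()

def sigmoid(z):
    """Logistic function used by binary logistic regression."""
    return 1.0 / (1.0 + np.exp(-z))

z = 1.5  # arbitrary logit for the positive class
two_class = softmax(np.array([z, 0.0]))  # second logit pinned to 0

# The first softmax probability matches the sigmoid of the same logit.
print(two_class[0], sigmoid(z))
```

With two classes, the first probability is e^z / (e^z + 1), which is exactly the sigmoid of z, and the second is its complement.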

📚 Notation

C = number of classes = number of units of the output layer. So, $\hat{y}_j$ is a (C, 1) dimensional vector.

🎨 Softmax Layer

Softmax is implemented through a neural network layer just before the output layer. The Softmax layer must have the same number of nodes as the output layer.

💥 Softmax Activation Function
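
A minimal NumPy sketch of the activation, assuming the standard definition: exponentiate each entry of the pre-activation vector z to get t = e^z, then divide by the sum of t so the outputs are positive and add up to 1. The function name `softmax`, the column-per-example (C, m) layout, and the max-subtraction step for numerical stability are assumptions of this sketch, not taken from the notes.

```python
import numpy as np

def softmax(z):
    """Column-wise softmax for z of shape (C, m): one column per example."""
    # Subtracting the column max leaves the result unchanged but avoids overflow.
    t = np.exp(z - np.max(z, axis=0, keepdims=True))
    return t / np.sum(t, axis=0, keepdims=True)

z = np.array([[5.0], [2.0], [-1.0], [3.0]])  # C = 4 classes, m = 1 example
a = softmax(z)
print(a.shape)        # (4, 1)
print(a.sum(axis=0))  # each column sums to 1, up to rounding
```

The largest entry of z always gets the largest probability, but unlike a hard maximum, every class keeps a non-zero share.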


🔨 Hard Max function

Takes the output of the softmax layer and converts it into a 1 vs 0 vector (as I call it 🤭), with a 1 at the position of the largest value and 0 everywhere else. This vector will be our ŷ.

For example:

t = 0.13  ==> ŷ = 0
    0.75          1
    0.01          0
    0.11          0

And so on 🐾
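
The example above can be reproduced with a short NumPy sketch. The function name `hardmax` is hypothetical, and the (C, m) column-per-example layout is an assumption carried over from the notation section.

```python
import numpy as np

def hardmax(a):
    """Return a 0/1 array shaped like a, with a 1 at each column's maximum."""
    y_hat = np.zeros_like(a, dtype=int)
    # One argmax row index per column; ties resolve to the first maximum.
    y_hat[np.argmax(a, axis=0), np.arange(a.shape[1])] = 1
    return y_hat

t = np.array([[0.13], [0.75], [0.01], [0.11]])  # softmax output from the example
print(hardmax(t).ravel())  # [0 1 0 0]
```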

🔎 Loss Function


Y and ŷ are (C, m) dimensional matrices 👩‍🔧
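
Softmax outputs are usually trained with the cross-entropy loss, L(ŷ, y) = -Σ_j y_j log(ŷ_j), averaged over the m examples to get the cost. Assuming that is the loss meant here, a minimal NumPy sketch with (C, m) matrices could look like this; the function name `cross_entropy_cost` and the small `eps` guard against log(0) are assumptions of the sketch.

```python
import numpy as np

def cross_entropy_cost(Y, Y_hat, eps=1e-12):
    """Average cross-entropy over m examples; Y and Y_hat have shape (C, m)."""
    m = Y.shape[1]
    # Per-example loss: -sum_j y_j * log(y_hat_j); eps avoids log(0).
    losses = -np.sum(Y * np.log(Y_hat + eps), axis=0)
    return losses.sum() / m

# C = 4 classes, m = 2 examples; each column of Y is a one-hot label.
Y = np.array([[0, 1], [1, 0], [0, 0], [0, 0]])
Y_hat = np.array([[0.13, 0.70],
                  [0.75, 0.10],
                  [0.01, 0.10],
                  [0.11, 0.10]])
print(cross_entropy_cost(Y, Y_hat))
```

Because each label column is one-hot, only the true class contributes, so the per-example loss reduces to minus the log of the probability assigned to the correct class.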
