Softmax Regression

Multi-class problems

We can understand softmax regression by likening it to logistic regression: πŸ˜‹

Recall that logistic regression produces a decimal between 0 and 1.0. For example, a logistic regression output of 0.8 from an email classifier suggests an 80% chance of an email being spam and a 20% chance of it not being spam. Clearly, the sum of the probabilities of an email being either spam or not spam is 1.0.

Softmax extends this idea into the MULTI-CLASS world. That is, Softmax assigns decimal probabilities to each class in a multi-class problem. Those decimal probabilities must add up to 1.0.

  • Its other name is Maximum Entropy (MaxEnt) Classifier

We can say that softmax regression generalizes logistic regression.

Logistic regression is a special case of softmax where C = 2 πŸ€”
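
To see why, take C = 2: applying softmax to two scores reduces to the sigmoid of their difference, which is exactly the logistic regression model:

$\frac{e^{x_1}}{e^{x_1} + e^{x_2}} = \frac{1}{1 + e^{-(x_1 - x_2)}} = \sigma(x_1 - x_2)$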

πŸ“š Notation

C = number of classes = number of units of the output layer. So, the output $\hat{y}$ is a (C, 1) dimensional vector.

🎨 Softmax Layer

Softmax is implemented through a neural network layer just before the output layer. The Softmax layer must have the same number of nodes as the output layer.
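
As a minimal sketch of this idea in Keras (the hidden layer size and input shape below are illustrative, not from these notes), a model for C classes ends with a Dense layer of C units whose activation is softmax:

import tensorflow as tf

C = 4  # number of classes (illustrative)

# The final Dense layer has C units and a softmax activation,
# so its output is a vector of C class probabilities.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(C, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy')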

πŸ’₯ Softmax Activation Function

$Softmax(x_i) = \frac{\exp(x_i)}{\sum_{j} \exp(x_j)}$
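
A minimal NumPy sketch of this formula (subtracting max(x) before exponentiating is a standard numerical-stability trick; it cancels in the ratio, so the result is unchanged):

import numpy as np

def softmax(x):
    # Softmax(x_i) = exp(x_i) / sum_j exp(x_j)
    e = np.exp(x - np.max(x))  # max-subtraction avoids overflow in exp()
    return e / e.sum()

z = np.array([2.0, 1.0, 0.1])
print(softmax(z))        # β‰ˆ [0.659 0.242 0.099]
print(softmax(z).sum())  # 1.0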

πŸ”¨ Hard Max function

Takes the output of the softmax layer and converts it into a 1-vs-0 vector (as I call it 🀭), which will be our yΜ‚

For example:

t = 0.13  ==> yΜ‚ = 0
    0.75          1
    0.01          0
    0.11          0

And so on 🐾
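
A small NumPy sketch of this hard max step (argmax picks the winning class, then we one-hot encode it):

import numpy as np

def hard_max(t):
    # Convert softmax probabilities into a 1-vs-0 (one-hot) vector.
    y_hat = np.zeros_like(t)
    y_hat[np.argmax(t)] = 1  # 1 for the most probable class, 0 elsewhere
    return y_hat

t = np.array([0.13, 0.75, 0.01, 0.11])
print(hard_max(t))  # [0. 1. 0. 0.]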

πŸ”Ž Loss Function

$L(\hat{y}, y) = -\sum_{j=1}^{C} y_j \log(\hat{y}_j)$

Over a training set of m examples, Y and YΜ‚ are (C, m) dimensional matrices πŸ‘©β€πŸ”§
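
A minimal NumPy sketch of this loss in the (C, m) layout above, averaged over the m examples (the small epsilon guarding log(0) is my addition):

import numpy as np

def cross_entropy(Y_hat, Y, eps=1e-12):
    # L(y_hat, y) = -sum_j y_j * log(y_hat_j), averaged over m examples.
    # Y_hat and Y are (C, m) matrices; each column of Y is one-hot.
    m = Y.shape[1]
    return -np.sum(Y * np.log(Y_hat + eps)) / m

# One example (m = 1): the true class is index 1.
Y = np.array([[0.0], [1.0], [0.0], [0.0]])
Y_hat = np.array([[0.13], [0.75], [0.01], [0.11]])
print(cross_entropy(Y_hat, Y))  # -log(0.75) β‰ˆ 0.288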

🧐 Read More

  β€’ Long story short from Google documentation