
Common Concepts


Basic Concepts of ANN

🍭 Basic Neural Network

Convention: The NN in the image is called a 2-layer NN, since the input layer is not counted πŸ“’β—

πŸ“š Common Terms

| Term | Description |
| --- | --- |
| 🌚 Input Layer | The layer that contains the inputs to the NN |
| 🌜 Hidden Layer | The layer(s) where the computational operations are done |
| 🌝 Output Layer | The final layer of the NN; it is responsible for generating the predicted value yΜ‚ |
| 🧠 Neuron | A placeholder for a mathematical function; it applies a function to its inputs and provides an output |
| πŸ’₯ Activation Function | A function that converts the input signal of a node into an output signal by applying some transformation |
| πŸ‘Ά Shallow NN | A NN with a small number of hidden layers (one or two) |
| πŸ’ͺ Deep NN | A NN with a large number of hidden layers |
| $n^{[l]}$ | Number of units in layer $l$ |

🧠 What does an artificial neuron do?

It calculates a weighted sum of its inputs, adds a bias, and then decides whether it should be fired or not according to an activation function
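A minimal NumPy sketch of a single neuron (the sigmoid used here is just one possible choice of activation function) πŸ‘©β€πŸ’»:

```python
import numpy as np

def sigmoid(z):
    # squashes the weighted sum into (0, 1)
    return 1 / (1 + np.exp(-z))

def neuron(x, w, b):
    # weighted sum of the inputs plus a bias ...
    z = np.dot(w, x) + b
    # ... passed through an activation function that decides the "firing" strength
    return sigmoid(z)

x = np.array([0.5, -1.2, 3.0])   # inputs
w = np.array([0.1, 0.4, -0.2])   # weights (one per input)
b = 0.3                          # bias
print(neuron(x, w, b))
```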

πŸ‘©β€πŸ”§ Parameters Dimension Control

| Parameter | Dimension |
| --- | --- |
| $w^{[l]}$ | $(n^{[l]}, n^{[l-1]})$ |
| $b^{[l]}$ | $(n^{[l]}, 1)$ |
| $dw^{[l]}$ | $(n^{[l]}, n^{[l-1]})$ |
| $db^{[l]}$ | $(n^{[l]}, 1)$ |

Making sure that these dimensions are correct helps us write better, bug-free πŸ› code
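A small sketch of how these dimensions can be asserted while initializing the parameters (the layer sizes below are made up for illustration):

```python
import numpy as np

# layer_dims[l] = n^[l]; e.g. 3 input features, two hidden layers, 1 output unit
layer_dims = [3, 4, 4, 1]

parameters = {}
for l in range(1, len(layer_dims)):
    parameters[f"W{l}"] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
    parameters[f"b{l}"] = np.zeros((layer_dims[l], 1))

    # sanity checks matching the dimension table above
    assert parameters[f"W{l}"].shape == (layer_dims[l], layer_dims[l - 1])
    assert parameters[f"b{l}"].shape == (layer_dims[l], 1)
```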

🎈 Summary of Forward Propagation Process

Input: $a^{[l-1]}$

Output: $a^{[l]}$, cache $(z^{[l]})$

πŸ‘©β€πŸ”§ Vectorized Equations

$$Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}$$

$$A^{[l]} = g^{[l]}(Z^{[l]})$$
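These equations map almost line-for-line to NumPy; a minimal sketch of one layer's forward step, assuming ReLU as $g^{[l]}$:

```python
import numpy as np

def relu(Z):
    # element-wise ReLU, used here as an example choice for g^[l]
    return np.maximum(0, Z)

def linear_activation_forward(A_prev, W, b, activation):
    # Z^[l] = W^[l] A^[l-1] + b^[l]
    Z = np.dot(W, A_prev) + b
    # A^[l] = g^[l](Z^[l])
    A = activation(Z)
    # cache what back propagation will need later
    cache = (A_prev, W, b, Z)
    return A, cache
```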

🎈 Summary of Back Propagation Process

Input: $da^{[l]}$

Output: $da^{[l-1]}, dW^{[l]}, db^{[l]}$

πŸ‘©β€πŸ”§ Vectorized Equations

$$dZ^{[l]} = dA^{[l]} * {g^{[l]}}'(Z^{[l]})$$

$$dW^{[l]} = \frac{1}{m} dZ^{[l]} A^{[l-1]T}$$

$$db^{[l]} = \frac{1}{m} \text{np.sum}(dZ^{[l]}, \text{axis}=1, \text{keepdims}=True)$$

$$dA^{[l-1]} = W^{[l]T} dZ^{[l]}$$
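And the same equations as a NumPy sketch, assuming the cache saved by the forward step above:

```python
import numpy as np

def relu_grad(Z):
    # derivative of ReLU, matching the forward sketch above
    return (Z > 0).astype(float)

def linear_activation_backward(dA, cache, activation_grad):
    A_prev, W, b, Z = cache
    m = A_prev.shape[1]  # number of examples

    # dZ^[l] = dA^[l] * g^[l]'(Z^[l])
    dZ = dA * activation_grad(Z)
    # dW^[l] = (1/m) dZ^[l] A^[l-1].T
    dW = (1 / m) * np.dot(dZ, A_prev.T)
    # db^[l] = (1/m) sum of dZ^[l] over the examples
    db = (1 / m) * np.sum(dZ, axis=1, keepdims=True)
    # dA^[l-1] = W^[l].T dZ^[l]
    dA_prev = np.dot(W.T, dZ)
    return dA_prev, dW, db
```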

➰➰ To Put Forward Prop. and Back Prop. Together

πŸ˜΅πŸ€•
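One way to untangle that picture is a plain gradient descent loop; the sketch below reuses the forward/backward helpers from the sketches above, with made-up data and layer sizes:

```python
import numpy as np

np.random.seed(0)
X = np.random.randn(3, 5)                  # 3 features, 5 examples (made-up data)
Y = (np.random.rand(1, 5) > 0.5) * 1.0     # binary labels

# parameters initialized with the shapes from the dimension table above
W1, b1 = np.random.randn(4, 3) * 0.01, np.zeros((4, 1))
W2, b2 = np.random.randn(1, 4) * 0.01, np.zeros((1, 1))

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))

learning_rate = 0.05                       # hyperparameters that ...
for i in range(1000):                      # ... control the parameters below
    # forward propagation, layer by layer, caching values for back propagation
    A1, cache1 = linear_activation_forward(X, W1, b1, relu)
    A2, cache2 = linear_activation_forward(A1, W2, b2, sigmoid)

    # for a sigmoid output with cross-entropy loss, dZ^[2] simplifies to A2 - Y,
    # so the identity is passed as the output layer's "activation gradient"
    dA1, dW2, db2 = linear_activation_backward(A2 - Y, cache2, np.ones_like)
    _,   dW1, db1 = linear_activation_backward(dA1, cache1, relu_grad)

    # gradient descent update of every parameter
    W1 -= learning_rate * dW1; b1 -= learning_rate * db1
    W2 -= learning_rate * dW2; b2 -= learning_rate * db2
```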

✨ Parameters vs Hyperparameters

πŸ‘©β€πŸ« Parameters

  • $W^{[1]}, W^{[2]}, W^{[3]}$

  • $b^{[1]}, b^{[2]}$

  • ...

πŸ‘©β€πŸ”§ Hyperparameters

  • Learning rate

  • Number of iterations

  • Number of hidden layers

  • Number of hidden units

  • Choice of activation function

  • ...

We can say that hyperparameters control parameters πŸ€”
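For example, a typical experiment fixes the hyperparameters up front, and the parameters $W^{[l]}, b^{[l]}$ come out of training (the names and values below are made up for illustration):

```python
# hyperparameters: chosen by us before training
hyperparameters = {
    "learning_rate": 0.01,
    "num_iterations": 1000,
    "hidden_layer_sizes": [4, 4],
    "activation": "relu",
}

# parameters: W^[l] and b^[l] are learned during training, and their final
# values depend on the hyperparameter choices above
```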

My detailed notes on activation functions are here πŸ‘©β€πŸ«
