πŸ•Έ Common Applications of CNNs

Application

Description

πŸ§’πŸ‘§ Face Verification

Recognizing if that the given image and ID are belonging to the same person

πŸ‘Έ Face Recognition

Assigning ID to the input face image

🌠 Neural Style Transfer

Converting an image to another by learning the style from a specific image

πŸ§’πŸ‘§ Face Verification

πŸ™Œ Comparison

Term

Question

Input

Output

Problem Class

πŸ§’πŸ‘§ Face Verification

Is this the claimed person? πŸ•΅οΈβ€β™‚οΈ

Face image / ID

True / False

1:1

πŸ‘Έ Face Recognition

Who is this person? 🧐

Face image

ID of K faces in DB

1:K

πŸ€Έβ€β™€οΈ Solving Approach

🀳 One Shot Learning

Learning from one example (that we have in the database) to recognize the person again

πŸ–‡ The Process

  • Get input image

  • Check if it belongs to the faces you have in the DB

πŸ‘“ How to Check?

We have to calculate the similarity between the input image and the image in the database, so:

  • β­• Use some function that

    • similarity(img_in, img_db) = some_val

  • πŸ‘·β€β™€οΈ Specifiy a threshold value

  • πŸ•΅οΈβ€β™€οΈ Check the threshold and specify the output

πŸ€” What can the similarity function be?

πŸ”· Siamese Network

A CNN which is used in face verification context, it recievs two images as input, after applying convolutions it calculates a feature vector from each image and, calculates the difference between them and then gives outputs decision.

In other words: it encodes the given images

πŸ‘€ Visualization

Architecture:

πŸ‘©β€πŸ« How to Train?

We can train the network by taking an anchor (basic) image A and comparing it with both a positive sample P and a negative sample N. So that:

  • 🚧 The dissimilarity between the anchor image and positive image must low

  • 🚧 The dissimilarity between the anchor image and the negative image must be high

So:

​L=max(d(a,p)βˆ’d(a,n)+margin,0)L=max(d(a,p)-d(a,n)+margin, 0)​

Another variable called margin, which is a hyperparameter is added to the loss equation. Margin defines how far away the dissimilarities should be, i.e if margin = 0.2 and d(a,p) = 0.5 then d(a,n) should at least be equal to 0.7. Margin helps us distinguish the two images better πŸ€Έβ€β™€οΈ

Therefore, by using this loss function we:

  • πŸ‘©β€πŸ« Calculate the gradients and with the help of the gradients

  • πŸ‘©β€πŸ”§ We update the weights and biases of the Siamese network.

For training the network, we:

  • πŸ‘©β€πŸ« Take an anchor image and randomly sample positive and negative images and compute its loss function

  • πŸ€Ήβ€β™‚οΈ Update its gradients

🌠 Neural Style Transfer

Generating an image G by giving a content image C and a style image S

πŸ‘€ Visualization

So to generate G, our NN has to learn features from S and apply suitable filters on C

πŸ‘©β€πŸŽ“ Methodology

Usually we optimize the parameters -weights and biases- of the NN to get the wanted performance, here in Neural Style Transfer we start from a blank image composed of random pixel values, and we optimize a cost function by changing the pixel values of the image 🧐

In other words, we:

  • β­• Start with a blank image consists of random pixels

  • πŸ‘©β€πŸ« Define some cost function J

  • πŸ‘©β€πŸ”§ Iteratively modify each pixel so as to minimize our cost function

Long story short: While training NNs we update our weights and biases, but in style transfer, we keep the weights and biases constant, and instead update our image itself πŸ™Œ

⌚ Cost Function

We can define J as

​J(G)=Ξ±JContent(C,G)+Ξ²JStyle(S,G)J(G)=\alpha J_{Content}(C,G)+\beta J_{Style}(S,G)​

Which:

  • ​JContentJ_{Content} denotes the similarity between G and C

  • ​JStyleJ_{Style} denotes the similarity between G and S

  • Ξ± and Ξ² hyperparameters