πŸ“š General Concepts of Sequence Models

πŸ‘©β€πŸ« Notation

In the context of text processing (e.g., Natural Language Processing, NLP):

| Symbol | Description |
| --- | --- |
| $X^{<t>}$ | The $t$-th word in the input sequence |
| $Y^{<t>}$ | The $t$-th word in the output sequence |
| $X^{(i)<t>}$ | The $t$-th word in the $i$-th input sequence |
| $Y^{(i)<t>}$ | The $t$-th word in the $i$-th output sequence |
| $T_x^{(i)}$ | The length of the $i$-th input sequence |
| $T_y^{(i)}$ | The length of the $i$-th output sequence |
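A minimal sketch of how this notation maps onto code, using a hypothetical example sentence (the variable names are illustrative, not from the original):

```python
# Hypothetical input sequence X^(i), tokenized by whitespace
x = "The Girl Likes Apple And Berry".split()

T_x = len(x)   # length of the input sequence, T_x^(i)
x_3 = x[2]     # the 3rd word, X^<3> (t is 1-based, list index is 0-based)

print(T_x)  # 6
print(x_3)  # Likes
```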

πŸš€ One Hot Encoding

A way to represent words so that we can work with them easily

πŸ”Ž Example

Let's say we have a dictionary that consists of just 10 words (🀭):

  • Car, Pen, Girl, Berry, Apple, Likes, The, And, Boy, Book.

Our $X^{(i)}$ is: The Girl Likes Apple And Berry

So we can represent this sequence like the following πŸ‘€

```
            The    Girl   Likes  Apple  And    Berry
Car   (0)  ⌈ 0 βŒ‰  ⌈ 0 βŒ‰  ⌈ 0 βŒ‰  ⌈ 0 βŒ‰  ⌈ 0 βŒ‰  ⌈ 0 βŒ‰
Pen   (1)  | 0 |  | 0 |  | 0 |  | 0 |  | 0 |  | 0 |
Girl  (2)  | 0 |  | 1 |  | 0 |  | 0 |  | 0 |  | 0 |
Berry (3)  | 0 |  | 0 |  | 0 |  | 0 |  | 0 |  | 1 |
Apple (4)  | 0 |  | 0 |  | 0 |  | 1 |  | 0 |  | 0 |
Likes (5)  | 0 |  | 0 |  | 1 |  | 0 |  | 0 |  | 0 |
The   (6)  | 1 |  | 0 |  | 0 |  | 0 |  | 0 |  | 0 |
And   (7)  | 0 |  | 0 |  | 0 |  | 0 |  | 1 |  | 0 |
Boy   (8)  | 0 |  | 0 |  | 0 |  | 0 |  | 0 |  | 0 |
Book  (9)  ⌊ 0 βŒ‹  ⌊ 0 βŒ‹  ⌊ 0 βŒ‹  ⌊ 0 βŒ‹  ⌊ 0 βŒ‹  ⌊ 0 βŒ‹
```

By representing sequences in this way we can feed our data to neural networks ✨
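The encoding above can be sketched in a few lines of NumPy (the function and variable names are illustrative assumptions, not a standard API):

```python
import numpy as np

# Toy dictionary from the example above
vocab = ["Car", "Pen", "Girl", "Berry", "Apple", "Likes", "The", "And", "Boy", "Book"]
word_to_index = {word: i for i, word in enumerate(vocab)}

def one_hot(word, vocab_size=len(vocab)):
    """Return a one-hot vector with a 1 at the word's dictionary index."""
    vec = np.zeros(vocab_size)
    vec[word_to_index[word]] = 1
    return vec

sentence = "The Girl Likes Apple And Berry".split()
# Each column of X is the one-hot vector of one word in the sequence
X = np.stack([one_hot(w) for w in sentence], axis=1)
print(X.shape)  # (10, 6): vocabulary size x sequence length
```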

πŸ™„ Disadvantages

  • If our dictionary consists of 10,000 words, each vector will be 10,000-dimensional 🀕

  • This representation cannot capture semantic relationships between words πŸ’”
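The second point can be seen directly: the dot product between any two distinct one-hot vectors is always zero, so the representation treats every pair of words as equally dissimilar (a small sketch with hypothetical vectors for the toy dictionary):

```python
import numpy as np

# One-hot vectors for three words from the 10-word toy dictionary
apple = np.zeros(10); apple[4] = 1
berry = np.zeros(10); berry[3] = 1
car   = np.zeros(10); car[0] = 1

# "Apple" is no closer to "Berry" than it is to "Car":
# distinct one-hot vectors never overlap, so every dot product is 0
print(apple @ berry)  # 0.0
print(apple @ car)    # 0.0
```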