In this repository, I have explained the complete architecture of the Transformer, step by step, with the help of code.
- `1_self_attention`: explains the complete self-attention mechanism.
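The repository's own notebook isn't reproduced here, but a minimal NumPy sketch of scaled dot-product self-attention (function and weight names are illustrative, not taken from the repo) might look like:

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max before exponentiating for numerical stability.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over x of shape (seq_len, d_model)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)      # (seq_len, seq_len) similarity scores
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ v                   # weighted sum of the values

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))                              # 4 tokens, d_model = 8
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (4, 8)
```

Each output row is a mixture of all value rows, weighted by how similar that token's query is to every token's key.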
- `2_multi_head_attention`: explains the multi-head attention mechanism.
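A minimal sketch of the multi-head idea, assuming random per-head projection matrices (the repo's actual parameterization may differ): each head attends in its own subspace of size `d_model / num_heads`, and the head outputs are concatenated and projected back.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, num_heads):
    seq_len, d_model = x.shape
    d_head = d_model // num_heads        # assumes d_model divisible by num_heads
    rng = np.random.default_rng(1)       # illustrative random weights
    heads = []
    for _ in range(num_heads):
        w_q, w_k, w_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))
        q, k, v = x @ w_q, x @ w_k, x @ w_v
        weights = softmax(q @ k.T / np.sqrt(d_head), axis=-1)
        heads.append(weights @ v)                   # (seq_len, d_head) per head
    concat = np.concatenate(heads, axis=-1)         # (seq_len, d_model)
    w_o = rng.normal(size=(d_model, d_model))
    return concat @ w_o                             # final output projection

x = np.random.default_rng(0).normal(size=(4, 8))
y = multi_head_attention(x, num_heads=2)
print(y.shape)  # (4, 8)
```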
- `3_positional_encoding`: explains the positional encoding mechanism.
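The standard sinusoidal positional encoding from "Attention Is All You Need" can be sketched like this (assuming an even `d_model`; the repo may use a different variant):

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal position encodings; rows are positions, columns are dimensions."""
    pos = np.arange(seq_len)[:, None]                 # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]              # (1, d_model / 2)
    angle = pos / np.power(10000, 2 * i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle)   # even dimensions get sine
    pe[:, 1::2] = np.cos(angle)   # odd dimensions get cosine
    return pe

pe = positional_encoding(10, 16)
print(pe.shape)  # (10, 16)
```

These encodings are simply added to the token embeddings so the model can tell positions apart.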
- `4_Layer_normalization`: explains the Add & Norm mechanism.
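A minimal sketch of Add & Norm: a residual connection followed by layer normalization. (A real LayerNorm also has learnable scale and shift parameters, omitted here for brevity.)

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    # Normalize each token's feature vector to zero mean and unit variance.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def add_and_norm(x, sublayer_out):
    # Residual connection, then layer normalization.
    return layer_norm(x + sublayer_out)

x = np.random.default_rng(0).normal(size=(4, 8))
y = add_and_norm(x, x * 0.5)           # any sublayer output of the same shape
print(np.allclose(y.mean(axis=-1), 0))  # True: each row is zero-mean
```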
- `5_1_Encoder_architecture`: a complete walkthrough of a single encoder layer. (`5_2_Encoder_architecture` is just the .py version of the same code.)
- `Decoder_architecture`: a complete walkthrough of a single decoder layer.
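The decoder's distinctive sublayer is *masked* self-attention (its cross-attention sublayer then attends over the encoder output). A sketch of the causal mask, with illustrative random weights:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def masked_self_attention(x, w_q, w_k, w_v):
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Causal mask: position i may only attend to positions <= i,
    # so the decoder cannot peek at future tokens during training.
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores[mask] = -1e9            # ~zero weight after the softmax
    return softmax(scores, axis=-1) @ v

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = masked_self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (4, 8); row 0 attends only to itself
```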
- `7_sentence_tokenization`: explains the tokenization mechanism.
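To illustrate the idea, here is a toy whitespace tokenizer with an integer vocabulary and special tokens (the names `<pad>`, `<sos>`, `<eos>` are common conventions, not necessarily the repo's; production models use subword schemes like BPE or WordPiece):

```python
def build_vocab(sentences):
    # Reserve ids for padding and start/end-of-sequence markers.
    vocab = {"<pad>": 0, "<sos>": 1, "<eos>": 2}
    for s in sentences:
        for tok in s.lower().split():
            vocab.setdefault(tok, len(vocab))
    return vocab

def encode(sentence, vocab):
    # Map tokens to ids, wrapped in start/end markers.
    return [vocab["<sos>"]] + [vocab[t] for t in sentence.lower().split()] + [vocab["<eos>"]]

vocab = build_vocab(["the cat sat", "the dog ran"])
print(encode("the cat ran", vocab))  # [1, 3, 4, 7, 2]
```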
- `8_complete_transformer`: constructs the complete Transformer model from the building blocks above.
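At a shape level, the full pipeline is: embed token ids, add positional encodings, run the encoder stack, run the decoder stack against the encoder output, then project to vocabulary logits. A deliberately simplified sketch (random matrices stand in for the trained encoder/decoder stacks; weight tying with the embedding is one common design choice, not necessarily the repo's):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_model, seq_len = 20, 8, 5

# Token embedding table plus positional encodings (both random, shapes only).
embed = rng.normal(size=(vocab_size, d_model))
pos = rng.normal(size=(seq_len, d_model))

src = rng.integers(0, vocab_size, size=seq_len)   # source token ids
x = embed[src] + pos                              # (seq_len, d_model)

# Stand-in for the encoder/decoder stacks: any (seq_len, d_model) -> same-shape map.
w = rng.normal(size=(d_model, d_model))
memory = np.tanh(x @ w)                           # "encoder output"
dec = np.tanh(memory @ w)                         # "decoder output"

# Final linear projection to vocabulary logits, then argmax per position.
logits = dec @ embed.T                            # tied with the embedding table
pred = logits.argmax(axis=-1)
print(pred.shape)  # (5,) — one predicted token id per position
```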
Key things to note:
- Query: what I'm looking for
- Key: what I have to offer (a label to match against)
- Value: what I actually offer
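The Query/Key/Value intuition above can be seen in a tiny worked example (all numbers made up for illustration): the query is compared against each key, and the resulting weights blend the values.

```python
import numpy as np

# Two "memory" tokens, each with a key (its label) and a value (its content).
keys   = np.array([[1.0, 0.0], [0.0, 1.0]])
values = np.array([[10.0, 0.0], [0.0, 20.0]])
query  = np.array([1.0, 0.0])       # "I'm looking for something like key 0"

scores = keys @ query               # similarity of the query to each key
weights = np.exp(scores) / np.exp(scores).sum()   # softmax over the scores
out = weights @ values              # mostly value 0, a little of value 1
print(out.round(2))                 # weight on value 0 is ~0.73
```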