The initial implementation was a bigram model that predicts the next character from a single input character, encoded as a one-hot vector. It was then extended to a multi-layer perceptron that takes three characters as input and predicts the fourth.
This implementation is based on Andrej Karpathy's nanoGPT tutorial (https://www.youtube.com/watch?v=kCc8FmEb1nY&t=3994s).
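
The two models described above can be sketched roughly as follows. This is a minimal, untrained illustration in plain Python, not the project's actual code. The toy corpus, the layer sizes, the character embedding table `C`, and all function names (`one_hot`, `bigram_probs`, `mlp_probs`) are assumptions made for this example.

```python
import math
import random

# Toy corpus and vocabulary (assumed for illustration only).
text = "hello world"
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}
vocab_size = len(chars)

rng = random.Random(0)

def rand_matrix(rows, cols):
    """Random (untrained) weight matrix as a list of rows."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def one_hot(index, size):
    """Vector of zeros with a single 1.0 at `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec

def matvec(x, W):
    """Multiply row vector x (length n) by matrix W (n x m)."""
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(len(W[0]))]

def softmax(logits):
    """Turn logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Bigram model: one-hot input times a vocab_size x vocab_size weight matrix.
W = rand_matrix(vocab_size, vocab_size)

def bigram_probs(ch):
    """Distribution over the next character given a single character."""
    x = one_hot(stoi[ch], vocab_size)
    return softmax(matvec(x, W))

# MLP: three context characters -> embeddings -> tanh hidden layer -> logits.
embed_dim, hidden_dim, context_len = 2, 8, 3  # assumed sizes
C = rand_matrix(vocab_size, embed_dim)        # assumed embedding table
W1 = rand_matrix(context_len * embed_dim, hidden_dim)
b1 = [0.0] * hidden_dim
W2 = rand_matrix(hidden_dim, vocab_size)
b2 = [0.0] * vocab_size

def mlp_probs(context):
    """Distribution over the fourth character given three characters."""
    assert len(context) == context_len
    x = [v for ch in context for v in C[stoi[ch]]]  # concatenate embeddings
    h = [math.tanh(a + b) for a, b in zip(matvec(x, W1), b1)]
    logits = [a + b for a, b in zip(matvec(h, W2), b2)]
    return softmax(logits)
```

Calling `bigram_probs("h")` or `mlp_probs("hel")` returns one probability per vocabulary character. Because the weights are random, the outputs are only meaningful after training, for example by minimizing the negative log-likelihood of the observed next characters. Multiplying a one-hot vector by `W` simply selects one row of `W`, which is why the bigram model is equivalent to a lookup table of next-character logits.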