This is part 3 of an ongoing series on language models, which began by defining neural machine translation and exploring the transformer architecture, and then moved on to implementing GPT-2 from scratch.
Specifically, in the first part of the series, we implemented a transformer model from scratch, discussed language models in general, and built a neural machine translator.
In the second part, we implemented GPT-2, building on the transformer code from part 1, and learned how to generate text with the GPT-2 model.
In part 3, we'll build a language model using Hugging Face's pre-trained GPT-2.
Since we already discussed what GPT-2 is and explored its model structure and attributes in part 2, I won't repeat that here. If you need a refresher on GPT-2, check out part 2!
In this part of the series, I'll aim to get straight to the point wherever possible.