This is part 3 of an ongoing series on language models. The series started by defining neural machine translation and exploring the transformer architecture, and then moved on to implementing GPT-2 from scratch.

Specifically, in the first part of the series, we implemented a transformer model from scratch, talked about language models in general, and also created a Neural Machine Translator.

In the second part, we implemented GPT-2, starting from the transformer model code in part 1, and we learned how to actually generate text with the GPT-2 model.

In part 3, we’ll be creating a language model using Hugging Face’s pre-trained GPT-2.

Since we've already discussed what GPT-2 is and explored the model's structure and attributes in part 2, I won't cover that again here. If you need a refresher on what GPT-2 is all about, check out part 2!

Goals:

  • Create an end-to-end approach to generate text using fine-tuned GPT-2 from Hugging Face.

In this part of the series, I'll try to get straight to the point as much as possible.
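To give a sense of where we're headed, here's a minimal sketch of loading Hugging Face's pre-trained GPT-2 with the `transformers` library and generating text from a prompt. This is just a preview, not the fine-tuning workflow itself; the prompt and the sampling parameters (`max_length`, `top_k`, `top_p`) are illustrative defaults I've chosen, not settings from the article.

```python
# A minimal sketch: load Hugging Face's pre-trained GPT-2 and sample a continuation.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "Machine learning is"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Generation parameters below are illustrative defaults, not tuned values.
output_ids = model.generate(
    input_ids,
    max_length=50,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,
)

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Later in this part, we'll replace the stock `"gpt2"` checkpoint with a fine-tuned model and wrap the same generation step into an end-to-end pipeline.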

