What is GPT-3?

In February 2019, the artificial intelligence research lab OpenAI sent shockwaves through the world of computing by releasing the GPT-2 language model. Short for “Generative Pretrained Transformer 2,” GPT-2 is able to generate several paragraphs of natural language text — often impressively realistic and internally coherent — based on a short prompt.

Scarcely a year later, OpenAI has already outdone itself with GPT-3, a new generative language model that is bigger than GPT-2 by orders of magnitude. The largest version of the GPT-3 model has 175 billion parameters, more than 100 times the 1.5 billion parameters of GPT-2. (For reference, the number of neurons in the human brain is usually estimated as 85 billion to 120 billion, and the number of synapses is roughly 150 trillion.)

Just like its predecessor GPT-2, GPT-3 was trained on a simple task: given the previous words in a text, predict the next word. Learning this objective required the model to consume very large datasets of Internet text, such as Common Crawl and Wikipedia, totalling 499 billion tokens (i.e. words, sub-word fragments, punctuation, and numbers).
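
To make the training objective concrete, here is a minimal sketch of next-word prediction. Since GPT-3 itself is only available through OpenAI's API, the sketch uses the smaller, publicly released GPT-2 model via the Hugging Face transformers library; neither the library nor the specific model checkpoint is mentioned in this article, so treat this purely as an illustration of the shared objective. Given a prompt, the model scores every token in its vocabulary as a candidate for the next token:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load the small, publicly available GPT-2 checkpoint and its tokenizer.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# Encode a short prompt into token ids.
prompt = "The Eiffel Tower is located in"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Run the model; logits has shape (batch, sequence_length, vocab_size).
with torch.no_grad():
    logits = model(input_ids).logits

# The logits at the last position score every vocabulary token
# as a possible continuation of the prompt.
next_token_logits = logits[0, -1, :]
probs = torch.softmax(next_token_logits, dim=-1)

# Show the five most likely next tokens and their probabilities.
top = torch.topk(probs, k=5)
for p, tok_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode([tok_id.item()])!r}: {p.item():.3f}")
```

During training, the cross-entropy between this predicted distribution and the token that actually follows is minimized over the full corpus; at generation time, the model samples one token from the distribution, appends it to the prompt, and repeats, which is how it produces whole paragraphs from a short seed.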

But how does GPT-3 work under the hood? Is it really a major step up from GPT-2? And what are the possible implications and applications of the GPT-3 model?

