• GPT-3 is the largest language model trained to date.
  • The basic mode of operation of GPT-3 is to generate text responses based on input text, e.g., to answer a question or to write an essay from a title.
  • OpenAI now provides a developer API to interact with GPT-3 and build applications on top of it.
  • GPT-3 is a few-shot learner. It requires priming with a few examples to work in a specific context.
  • Once primed correctly, GPT-3 can perform math calculations and generate answers in programming languages, even though it was never explicitly trained to do either.

The first wave of GPT-3-enabled applications has stunned developer Twitter. They offer a glimpse of our AI future.

GPT-3 (Generative Pre-trained Transformer 3) is OpenAI’s latest and greatest natural-language prediction model. Simply put, it generates text in response to any input text: it is a program that responds to questions or statements.
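Developers reach the model through OpenAI’s API. As a minimal sketch (not from the article), assuming beta API access and the Python openai package of the time, a call could look like the following; the API key, engine name, prompt, and parameters are illustrative assumptions.

```python
# Hypothetical sketch: send a prompt to GPT-3 and print its text response.
import openai

openai.api_key = "YOUR_API_KEY"  # assumes you have beta API access

response = openai.Completion.create(
    engine="davinci",                               # the largest GPT-3 model
    prompt="Q: What causes the seasons on Earth?\nA:",
    max_tokens=64,
    temperature=0.7,
    stop=["\n"],                                    # stop at the end of the answer
)
print(response.choices[0].text.strip())
```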

GPT-3 is pre-trained on a large amount of natural-language text from the Internet (45 TB of training text containing 499 billion words). Training it on GPUs cost at least 4.6 million US dollars (some estimates run as high as $12 million). The resulting model has 175 billion parameters.

InfoQ covered OpenAI’s GPT-3 announcement back in June. It is 100x bigger than any previous language AI model. In the official GPT-3 research study, the OpenAI team demonstrated that GPT-3 achieves state-of-the-art performance out of the box, without any fine-tuning. But how does it work in the real world? Is it just another toy, or a serious threat to humanity? Well, a month after its initial release, the first GPT-3-powered applications are emerging. Now we can see for ourselves.

We feel most developers can absolutely build projects using GPT-3 very quickly. - Yash Dani & George Saad

For this article, we interviewed many of these creators and entrepreneurs and reviewed some of their applications. Developers and journalists alike have described GPT-3 as “shockingly good” and “mind-blowing.”

How it works

The GPT-3 model generates text one word at a time. As a hypothetical example, let’s say that a developer gives it the following words as input.

“Answer to the Ultimate Question of Life, the Universe, and Everything is”

The AI model might generate the word “forty” as the response. The developer then appends the generated word to the input and runs the model again.

“Answer to the Ultimate Question of Life, the Universe, and Everything is forty”

This time, the AI model might generate the word “two” as the response. Repeat once more, and the next response should be a period, completing the sentence.

“Answer to the Ultimate Question of Life, the Universe, and Everything is forty-two.”

GPT-3 can do this because it has seen this particular pop-culture reference many times in its training text, so its neural network can guess the “next word” with a high degree of statistical certainty.
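In code, the generate-append-repeat loop described above might look like the following hedged sketch against the OpenAI API. It is a hypothetical illustration: in practice the API can return many tokens in a single call, GPT-3 works on sub-word tokens rather than whole words, and the model may not return exactly “forty”, “two”, and “.”.

```python
# Hypothetical sketch of the generate-append-repeat loop described above.
import openai

openai.api_key = "YOUR_API_KEY"  # assumes beta API access

prompt = "Answer to the Ultimate Question of Life, the Universe, and Everything is"

for _ in range(3):
    response = openai.Completion.create(
        engine="davinci",     # the 175-billion-parameter model
        prompt=prompt,
        max_tokens=1,         # mirrors the one-word-at-a-time illustration
        temperature=0.0,      # pick the most likely next token
    )
    prompt += response.choices[0].text   # append the new token and repeat
    print(prompt)
```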

However, in natural language, predictions are not always so clear-cut. The word that follows an input often depends on the context. That is where GPT-3’s strength as a few-shot learner comes in. Few-shot learning means priming GPT-3 with a few examples and then asking it to make predictions. That gives the AI model a language context and dramatically improves its accuracy. Figure 1 shows examples of zero-shot, one-shot, and few-shot learning used to prime an AI model to generate foreign-language translations.
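As a concrete illustration of few-shot priming in the spirit of Figure 1, here is a hedged sketch: a few English-to-French pairs establish the translation context, and the model is asked to complete the last line. The prompt wording and parameters are illustrative assumptions, not taken from the article.

```python
# Hypothetical sketch of few-shot priming: example pairs supply the context.
import openai

openai.api_key = "YOUR_API_KEY"  # assumes beta API access

few_shot_prompt = (
    "Translate English to French:\n"
    "sea otter => loutre de mer\n"
    "peppermint => menthe poivrée\n"
    "cheese =>"
)

response = openai.Completion.create(
    engine="davinci",
    prompt=few_shot_prompt,
    max_tokens=8,
    temperature=0.0,
    stop=["\n"],
)
print(response.choices[0].text.strip())   # expected completion: "fromage"
```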

