It was in February of last year that OpenAI published the results of training its unsupervised language model GPT-2. Trained on 40 GB of text (8 million web pages), it was able to predict the next words in a sequence. GPT-2, a transformer-based language model using self-attention, allowed us to generate very convincing and coherent texts. The quality was so good that the main model, with 1.5 billion parameters, wasn't initially made publicly accessible, to prevent an uncontrolled spread of fake news. Luckily, the complete model was later published and could even be used with Colab Notebooks.
This year, OpenAI strikes back with its new language model GPT-3, which has 175 billion parameters (read also: the GPT-3 paper).
Unnecessary spoiler: it’s incredibly good.
There are already some in-depth articles on TDS examining GPT-3's features and its paper:
The race for larger language models is entering the next round.
Pushing Deep Learning to the Limit with 175B Parameters
OpenAI is building an API, currently accessible via a waiting list:
An API for accessing new AI models developed by OpenAI
Fortunately, I was able to get access and experiment with GPT-3 directly. Here are some of my first results.
Screenshot: beta.openai.com // by: Merzmensch
The AI Playground interface looks simple, but it bears great power within. First of all, there is a **Settings** dialog, which lets you configure the text length, the temperature (from low/boring to standard to chaotic/creative), and other features.
You can also define where the generated text has to start and stop; these control functions have a direct impact on the textual results.
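For readers who later get API access: here is a minimal sketch of how these Playground settings map onto request parameters, assuming the beta-era openai Python package. The engine name, prompt, and stop sequence are illustrative placeholders, not values from my experiments.

```python
import openai

openai.api_key = "YOUR_API_KEY"  # hypothetical placeholder, not a real key

# A minimal completion request; the Playground settings correspond to
# these parameters (assumption: the beta-era Completion endpoint).
response = openai.Completion.create(
    engine="davinci",           # the largest GPT-3 engine in the beta
    prompt="Once upon a time",  # the text GPT-3 will continue
    max_tokens=64,              # text length, measured in tokens
    temperature=0.7,            # low = boring, high = chaotic/creative
    stop=["\n\n"],              # where the generated text has to stop
)

print(response["choices"][0]["text"])
```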
The simple interface also provides some GPT-3 presets. The amazing thing about transformer-driven GPT models is, among other things, their ability to recognize a specific style, text character, or structure. If you begin with lists, GPT-3 continues generating lists. If your prompt has a Q&A structure, it will be kept coherently. If you ask for a poem, it writes a poem.
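To illustrate this structure recognition, here is a sketch of a Q&A-shaped prompt, again assuming the beta-era openai package; the questions and answers are made up for demonstration.

```python
import openai

# A Q&A-shaped prompt: GPT-3 picks up the structure and continues it.
prompt = (
    "Q: What is the capital of France?\n"
    "A: Paris.\n"
    "Q: Who wrote Faust?\n"
    "A: Johann Wolfgang von Goethe.\n"
    "Q: What is the largest planet in our solar system?\n"
    "A:"
)

response = openai.Completion.create(
    engine="davinci",
    prompt=prompt,
    max_tokens=16,
    temperature=0.3,  # a lower temperature keeps the answers sober
    stop=["\n"],      # stop at the end of the answer line
)

print(response["choices"][0]["text"].strip())
```

With a prompt like this, the model typically answers in the same one-line format and stops at the line break, keeping the Q&A structure intact.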
#artificial-intelligence #naturallanguageprocessing #art #machine-learning #gpt-3 #deep-learning