The biggest AI news of 2020 so far is the success of OpenAI’s monstrous new language model, GPT-3. In this post, I’m going to quickly summarize why GPT-3 has caused such a splash, before highlighting 3 consequences for individuals and companies building things with AI.

GPT-3: a very brief primer

Why are people excited about GPT-3? Here’s why, in 3 tweets:

What’s going on here?

There are already lots of summary posts about GPT-3, so I won’t rehash them here.

For a great introduction to how the model works, check out this visual guide from the (reliably excellent) Jay Alammar. For a sober discussion of the model’s abilities and limitations, see Kevin Lacker’s Giving GPT-3 a Turing Test.

In short, GPT-3 is a model trained to autocomplete text: given the start of a passage, it predicts what comes next. It's been trained on huge chunks of the web. And it's learned a lot of interesting things along the way.

Why has this happened?

It turns out that memorizing lots of stuff is useful if you’re trying to autocomplete sentences from the internet. For instance, if you’re trying to finish phrases about Barack Obama, it’s helpful to memorize a bunch of stuff about him. How else can you complete the sentence “Barack Obama was born in ____”?

And so the model has learned a lot about Obama. And Trump. And about anyone and anything that crops up regularly on the internet.
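To make the autocomplete idea concrete, here's a minimal sketch of what prompting a model like this looks like in Python. GPT-3 itself is only available through OpenAI's API, so this uses GPT-2, its smaller, openly released predecessor, via the Hugging Face transformers library as a stand-in; the prompt and generation settings are illustrative assumptions.

```python
# pip install transformers torch
from transformers import pipeline

# GPT-2 is an open stand-in here; GPT-3 is only accessible via OpenAI's API.
generator = pipeline("text-generation", model="gpt2")

prompt = "Barack Obama was born in"
completions = generator(
    prompt,
    max_length=20,           # total length of prompt + completion, in tokens
    num_return_sequences=3,  # sample a few different continuations
    do_sample=True,
)

for c in completions:
    print(c["generated_text"])
```

GPT-2 will often get this kind of fact wrong where GPT-3 usually gets it right, which is exactly the gap that scaling up the model and its training data seems to close.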

In fact, it hasn't only learned facts; it's learned to create things. You can't create much by autocompleting sentences, but you can create things by autocompleting code. It turns out there's a lot of code on the internet, so the model has learned to write semi-coherent code. Give it the start of a snippet in a coding language like HTML or CSS, and it will complete it.
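As an illustration of the code-completion point, here's roughly what prompting GPT-3 looked like through OpenAI's API in 2020. Treat this as a hedged sketch: it assumes the 2020-era (pre-1.0) openai Python library, the "davinci" engine name, and an example HTML prompt chosen purely for illustration; you also need API access for it to run.

```python
# pip install openai  (the 2020-era, pre-1.0 version of the library)
import openai

openai.api_key = "YOUR_API_KEY"  # assumes you have GPT-3 API access

# Give the model the start of an HTML snippet and let it autocomplete the rest.
prompt = (
    '<!-- A pricing section with three tiers: Free, Pro, Enterprise -->\n'
    '<div class="pricing">\n'
)

response = openai.Completion.create(
    engine="davinci",   # base GPT-3 engine exposed by the early API
    prompt=prompt,
    max_tokens=150,
    temperature=0.3,
    stop=["</div>"],    # halt generation when the block would be closed
)

print(prompt + response["choices"][0]["text"])
```

The interesting part is that nothing here is specialized for HTML: code completion falls out of the same next-token prediction the model uses for prose.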


What does GPT-3 mean for AI?