The Unpredictable Abilities Emerging From Large AI Models
...
The Emergence of Emergence
Biologists, physicists, ecologists and other scientists use the term “emergent” to describe self-organizing, collective behaviors that appear when a large collection of things acts as one. Combinations of lifeless atoms give rise to living cells; water molecules create waves; murmurations of starlings swoop through the sky in changing but identifiable patterns; cells make muscles move and hearts beat. Critically, emergent abilities show up in systems that involve lots of individual parts. But researchers have only recently been able to document these abilities in LLMs as those models have grown to enormous sizes. ...
What makes a model “recurrent” is that it feeds its own output back into the network: each prediction becomes part of the input for the next step, so the model carries context forward through a sequence. ...
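The feedback loop described above can be sketched with a toy example. This is a hypothetical illustration, not an actual language model: `toy_recurrent_step` and its update rule are invented here purely to show outputs feeding back in as inputs.

```python
def toy_recurrent_step(hidden, token):
    # Hypothetical update rule: mix the carried-over state with the new token.
    return (hidden * 31 + token) % 1000

def run_sequence(tokens):
    """Process a sequence one token at a time; each step's output
    is fed back in as part of the next step's input."""
    hidden = 0
    for t in tokens:
        hidden = toy_recurrent_step(hidden, t)  # output becomes next input
    return hidden

print(run_sequence([1, 2, 3]))  # → 26
```

The point is only the wiring: the state produced at each step is what the next step consumes, which is how such a model carries earlier context forward.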
But the debut of LLMs also brought something truly unexpected. Lots of somethings. With the advent of models like GPT-3, which has 175 billion parameters — or Google’s PaLM, which can be scaled up to 540 billion — users began describing more and more emergent behaviors. One DeepMind engineer even reported being able to convince ChatGPT that it was a Linux terminal and getting it to run some simple mathematical code to compute the first 10 prime numbers. Remarkably, it could finish the task faster than the same code running on a real Linux machine. ...
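The article doesn't reproduce the code the engineer typed into the simulated terminal, but "simple mathematical code to compute the first 10 prime numbers" might look something like this trial-division sketch:

```python
def first_primes(n):
    """Return the first n prime numbers by trial division
    against the primes already found."""
    primes = []
    candidate = 2
    while len(primes) < n:
        # candidate is prime if no smaller prime divides it
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

print(first_primes(10))  # → [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

In the anecdote, the model wasn't executing anything like this loop; it was predicting what a terminal's output for such code would look like.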
..., researchers had no reason to think that a language model built to predict text would convincingly imitate a computer terminal. Many of these emergent behaviors illustrate “zero-shot” or “few-shot” learning, which describes an LLM’s ability to solve problems it has never — or rarely — seen before. ...
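To make the few-shot idea concrete, here is a hypothetical prompt of the kind commonly used with LLMs (the task and examples are invented for illustration, not taken from the article): a couple of worked examples are shown, and the model is asked to complete a new case it may never have been trained on directly.

```python
# A few-shot prompt: two worked examples, then a new case to complete.
few_shot_prompt = """\
Translate English to French.
English: cheese -> French: fromage
English: bread -> French: pain
English: water -> French:"""

print(few_shot_prompt)
```

A zero-shot prompt would drop the worked examples entirely and state only the task, relying on the model to generalize from its training alone.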
Unpredictable Powers and Pitfalls
There is an obvious problem with asking these models to explain themselves: They are notorious liars. “We’re increasingly relying on these models to do basic work,” Ganguli said, “but I do not just trust these. I check their work.” ...
“It’s hard to know in advance how these models will be used or deployed,” Ganguli said. “And to study emergent phenomena, you have to have a case in mind, and you won’t know until you study the influence of scale what capabilities or limitations might arise.” ...
“Certain harmful behaviors kind of come up abruptly in some models,” Ganguli said. He points to a recent analysis of LLMs, known as the BBQ benchmark, which showed that social bias emerges with enormous numbers of parameters. “Larger models abruptly become more biased.” Failure to address that risk, he said, could jeopardize the subjects of these models.
But he offers a counterpoint: When the researchers simply told the model not to rely on stereotypes or social biases — literally by typing in those instructions — the model was less biased in its predictions and responses. This suggests that some emergent properties might also be used to reduce bias. ...
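The intervention really is as plain as the passage suggests: an instruction in natural language, prepended to the prompt. The exact wording the researchers used isn't given in the excerpt, so the helper and instruction text below are a hypothetical sketch of the technique:

```python
def add_debias_instruction(user_prompt):
    """Prepend a plain-language debiasing instruction to a prompt,
    in the spirit of the intervention described above."""
    instruction = ("Do not rely on stereotypes or social biases "
                   "when answering.")
    return instruction + "\n\n" + user_prompt

print(add_debias_instruction("Complete the sentence: The nurse said..."))
```

That such a trivially simple prompt change measurably shifts a model's behavior is itself an example of a capability that emerged with scale rather than being engineered in.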
“We spend a lot of time just chatting with our models,” he said, “and that is actually where you start to get a good intuition about trust — or the lack thereof.” ...
See the full story here: https://www.quantamagazine.org/the-unpredictable-abilities-emerging-from-large-ai-models-20230316/