Artificial general intelligence, or AGI, has become a much-abused buzzword in the AI industry. Now, Google DeepMind wants to put the idea on a firmer footing.
The concept at the heart of the term AGI is that a hallmark of human intelligence is its generality. While specialist computer programs might easily outperform us at picking stocks or translating French to German, our superpower is that we can learn to do both.
Recreating this kind of flexibility in machines is the holy grail for many AI researchers, and is often speculated to be the first step towards artificial superintelligence. But what exactly people mean by AGI is rarely specified, and the idea is frequently described in binary terms: AGI is a piece of software that has crossed some mythical boundary, and once on the other side, it is on par with humans.
Researchers at Google DeepMind are now attempting to make the discussion more precise by concretely defining the term. Crucially, they suggest that rather than approaching AGI as an end goal, we should instead think about different levels of AGI, with today’s leading chatbots representing the first rung on the ladder.
“We argue that it is critical for the AI research community to explicitly reflect on what we mean by AGI, and aspire to quantify attributes like the performance, generality, and autonomy of AI systems,” the team writes in a preprint published on arXiv.
The researchers note that they took inspiration from autonomous driving, where capabilities are split into six levels of autonomy, which they say enable clear discussion of progress in the field.
To work out what they should include in their own framework, they studied some of the leading definitions of AGI proposed by others. By looking at the core ideas shared across these definitions, they identified six principles any definition of AGI needs to conform to.
For a start, a definition should focus on capabilities rather than the specific mechanisms AI uses to achieve them. This removes the need for AI to think like a human or be conscious to qualify as AGI.
They also suggest that generality alone is not enough for AGI; models must also hit certain thresholds of performance in the tasks they perform. This performance doesn’t need to be proven in the real world, they say—it’s enough to simply demonstrate a model has the potential to outperform humans at a task.
While some believe true AGI will not be possible unless AI is embodied in physical robotic machinery, the DeepMind team say this is not a prerequisite for AGI. The focus, they say, should be on cognitive and metacognitive tasks, such as learning to learn.
Another requirement is that benchmarks for progress have “ecological validity,” which means AI is measured on real-world tasks valued by humans. And finally, the researchers say the focus should be on charting progress in the development of AGI rather than fixating on a single endpoint.
Based on these principles, the team proposes a framework they call “Levels of AGI” that outlines a way to categorize algorithms based on their performance and generality. The levels range from “emerging,” which refers to a model equal to or slightly better than an unskilled human, to “competent,” “expert,” “virtuoso,” and “superhuman,” which denotes a model that outperforms all humans. These levels can be applied to either narrow or general AI, which helps distinguish between highly specialized programs and those designed to solve a wide range of tasks.
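To make the two-axis idea concrete, here is a minimal sketch of how such a performance-by-generality grid could be represented in code. The class and function names are my own illustrative choices, not terminology from the preprint, and the grid only covers the levels named in this article.

```python
from enum import IntEnum

class Level(IntEnum):
    """Performance levels named in the article (hypothetical encoding)."""
    EMERGING = 1    # equal to or slightly better than an unskilled human
    COMPETENT = 2
    EXPERT = 3
    VIRTUOSO = 4
    SUPERHUMAN = 5  # outperforms all humans

def classify(level: Level, general: bool) -> str:
    """Combine a performance level with a narrow/general scope label."""
    scope = "General" if general else "Narrow"
    return f"{level.name.title()} {scope} AI"

# Examples drawn from the article's claims below:
print(classify(Level.SUPERHUMAN, general=False))  # e.g. AlphaFold
print(classify(Level.EMERGING, general=True))     # e.g. leading chatbots
```

The point of the two axes is that a system can sit at the top of one and the bottom of the other: a superhuman narrow system and an emerging general system occupy very different cells of the grid.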
The researchers say some narrow AI algorithms, like DeepMind’s protein-folding algorithm AlphaFold, have already reached the superhuman level. More controversially, they suggest leading AI chatbots like OpenAI’s ChatGPT and Google’s Bard are examples of emerging AGI.
Julian Togelius, an AI researcher at New York University, told MIT Technology Review that separating out performance and generality is a useful way to distinguish previous AI advances from progress towards AGI. And more broadly, the effort helps to bring some precision to the AGI discussion. “This provides some much-needed clarity on the topic,” he says. “Too many people sling around the term AGI without having thought much about what they mean.”
The framework outlined by the DeepMind team is unlikely to win everyone over, and there are bound to be disagreements about how different models should be ranked. But with any luck, it will get people to think more deeply about a critical concept at the heart of the field.