Our brains are constantly learning. That new sandwich deli rocks. That gas station? Better avoid it in the future.
Memories like these physically rewire connections in the brain region that supports new learning. During sleep, the previous day’s memories are shuttled to other parts of the brain for long-term storage, freeing up brain cells for new experiences the next day. In other words, the brain can continuously soak up our everyday lives without losing access to memories of what came before.
AI, not so much. GPT-4 and other large language and multimodal models, which have taken the world by storm, are built using deep learning, a family of algorithms that loosely mimic the brain. The problem? “Deep learning systems with standard algorithms slowly lose the ability to learn,” Dr. Shibhansh Dohare at the University of Alberta recently told Nature.
The reason for this is in how they’re set up and trained. Deep learning relies on multiple networks of artificial neurons that are connected to each other. Feeding data into the algorithms—say, reams of online resources like blogs, news articles, and YouTube and Reddit comments—changes the strength of these connections, so that the AI eventually “learns” patterns in the data and uses these patterns to churn out eloquent responses.
But these systems are basically brains frozen in time. Tackling a new task sometimes requires a whole new round of training and learning, which erases what came before and costs millions of dollars. For ChatGPT and other AI tools, this means they become increasingly outdated over time.
This week, Dohare and colleagues found a way to solve the problem. The key is to selectively reset some artificial neurons after a task, but without substantially changing the entire network—a bit like what happens in the brain as we sleep.
When tested with a continual visual learning task—say differentiating cats from houses or telling apart stop signs and school buses—deep learning algorithms equipped with selective resetting easily maintained high accuracy over 5,000 different tasks. Standard algorithms, in contrast, rapidly deteriorated, their success eventually dropping to about a coin-toss.
Called continual back propagation, the strategy is “among the first of a large and fast-growing set of methods” to deal with the continuous learning problem, wrote Drs. Clare Lyle and Razvan Pascanu at Google DeepMind, who were not involved in the study.
Machine Mind
Deep learning is one of the most popular ways to train AI. Inspired by the brain, these algorithms have layers of artificial neurons that connect to form artificial neural networks.
As an algorithm learns, some connections strengthen, while others dwindle. This process, called plasticity, mimics how the brain learns and optimizes artificial neural networks so they can deliver the best answer to a problem.
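To make that wiring concrete, here is a toy sketch—every name and size below is illustrative, not taken from the study—of a two-layer artificial neural network in plain NumPy. Each entry in a weight matrix is one connection between neurons, and its magnitude is the connection’s strength, which training later adjusts:

```python
import numpy as np

# A tiny "deep" network: two layers of artificial neurons.
# Each weight entry is one connection; its magnitude is that
# connection's strength.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((4, 3))   # input (3 features) -> hidden (4 neurons)
W2 = rng.standard_normal((2, 4))   # hidden (4 neurons) -> output (2 classes)

def forward(x):
    hidden = np.maximum(0.0, W1 @ x)   # ReLU: a neuron "fires" only above zero
    return W2 @ hidden                 # one raw score per output class

scores = forward(np.array([1.0, -0.5, 2.0]))
```

Stacking more such layers is what puts the “deep” in deep learning; the learning itself is all in how the weight entries change.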
But deep learning algorithms aren’t as flexible as the brain. Once trained, their weights are essentially fixed. Learning a new task means retraining, which reconfigures the weights in the existing network—and in the process, the AI “forgets” previous experiences. This usually isn’t a problem for typical uses like recognizing images or processing language (with the caveat that the models can’t adapt to new data on the fly). But it’s highly problematic when training and using more sophisticated algorithms—for example, those that learn and respond to their environments like humans.
Using a classic gaming example, “a neural network can be trained to obtain a perfect score on the video game Pong, but training the same network to then play Space Invaders will cause its performance on Pong to drop considerably,” wrote Lyle and Pascanu.
Computer scientists have been battling the problem, aptly called catastrophic forgetting, for years. An easy solution is to wipe the slate clean and retrain an AI for a new task from scratch, using a combination of old and new data. Although this recovers the AI’s abilities, the nuclear option also erases all previous knowledge. And while the strategy is doable for smaller AI models, it isn’t practical for huge ones, such as those that power large language models.
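The Pong-and-Space-Invaders failure can be reproduced at miniature scale. In this hedged sketch—a one-weight “network” of my own construction, not the study’s setup—gradient descent masters task A, then training on task B drags the same weight away, and the error on task A climbs right back up:

```python
def train(w, target, steps=200, lr=0.1):
    # Gradient descent on the loss (w - target)^2: the network's
    # single "connection" w is pulled toward whatever task it sees last.
    for _ in range(steps):
        w -= lr * 2.0 * (w - target)
    return w

w = 0.0
w = train(w, target=2.0)           # task A: learn w ~ 2
loss_A_before = (w - 2.0) ** 2     # near zero: task A mastered
w = train(w, target=-2.0)          # task B: learn w ~ -2
loss_A_after = (w - 2.0) ** 2      # large again: task A forgotten
```

The same weights that encode the old task are the only place the new task can be written, so learning B necessarily overwrites A.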
Back It Up
The new study adds to a foundational mechanism of deep learning, a process called back propagation. Simply put, back propagation provides feedback to the artificial neural network. Depending on how close the output is to the right answer, back propagation tweaks the algorithm’s internal connections until it learns the task at hand. With continual learning, however, neural networks rapidly lose their plasticity, and they can no longer learn.
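For readers who want the mechanics, here is a minimal back propagation loop—the network sizes, learning rate, and toy regression target are all illustrative assumptions, not the paper’s configuration. The error at the output is pushed backward to assign each connection a gradient, and nudging the weights against those gradients drives the loss down:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((100, 1))
y = 3.0 * x + 1.0                       # the target the network must learn

W = rng.standard_normal((1, 8)) * 0.1   # input -> hidden connections
V = rng.standard_normal((8, 1)) * 0.1   # hidden -> output connections

def loss_and_grads(W, V):
    h = np.tanh(x @ W)                  # forward pass through the hidden layer
    pred = h @ V
    err = pred - y                      # how far the output is from the answer
    loss = np.mean(err ** 2)
    # Back propagation: push the error backward so each connection
    # gets its share of the blame, i.e. its gradient.
    dV = h.T @ err * (2 / len(x))
    dh = err @ V.T * (1 - h ** 2)       # tanh's derivative is 1 - tanh^2
    dW = x.T @ dh * (2 / len(x))
    return loss, dW, dV

first_loss, _, _ = loss_and_grads(W, V)
for _ in range(500):
    loss, dW, dV = loss_and_grads(W, V)
    W -= 0.1 * dW                       # tweak connections against the gradient
    V -= 0.1 * dV                       # loss shrinks step by step
```

Run long enough on a stream of ever-changing tasks, though, this same update rule is what slowly saps the network’s plasticity.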
Here, the team took a first step toward solving the problem using a 1959 theory with the impressive name of “Selfridge’s Pandemonium.” The theory captures how we continuously process visual information and has heavily influenced AI for image recognition and other fields.
Using ImageNet, a classic repository of millions of images for AI training, the team established that standard deep learning models gradually lose their plasticity when challenged with thousands of sequential tasks. These are ridiculously simple for humans—differentiating cats from houses, for example, or stop signs from school buses.
With this measure, any drop in performance means the AI is gradually losing its learning ability. On early tasks, the deep learning algorithms were accurate up to 88 percent of the time. But by task 2,000, they’d lost plasticity, and performance had fallen to near or below baseline.
The updated algorithm performed far better.
It still uses back propagation, but with a small difference. In every learning cycle, a tiny portion of the artificial neurons are wiped clean. To avoid disrupting whole networks, only the least-used artificial neurons get reset. The upgrade allowed the algorithm to tackle up to 5,000 different image recognition tasks with over 90 percent accuracy throughout.
In another proof of concept, the team used the algorithm to drive a simulated ant-like robot across multiple terrains to see how quickly it could learn and adjust with feedback.
With continual back propagation, the simulated critter easily navigated a video game road with variable friction—like hiking on sand, pavement, and rocks. The robot driven by the new algorithm soldiered on for at least 50 million steps. Those powered by standard algorithms crashed far earlier, with their performance tanking to zero roughly 30 percent sooner.
The study is the latest to tackle deep learning’s plasticity problem.
A previous study found that so-called dormant neurons—ones that no longer respond to signals from their network—make AI more rigid, and that reconfiguring them throughout training improved performance. But they’re not the entire story, wrote Lyle and Pascanu. The loss of the ability to learn could also stem from network interactions that destabilize the way the AI learns. Scientists are still only scratching the surface of the phenomenon.
Meanwhile, for practical uses, when it comes to AIs, “you want them to keep with the times,” said Dohare. Continual learning isn’t just about telling apart cats from houses. It could also help self-driving cars better navigate new streets in changing weather or lighting conditions—especially in regions with microenvironments, where fog might rapidly shift to bright sunlight.
Tackling the problem “presents an exciting opportunity” that could lead to AI that retains past knowledge while learning new information and, like us humans, flexibly adapts to an ever-changing world. “These capabilities are crucial to the development of truly adaptive AI systems that can continue to train indefinitely, responding to changes in the world and learning new skills and abilities,” wrote Lyle and Pascanu.
Image Credit: Jaredd Craig / Unsplash