Trial and error is one of the most fundamental learning strategies employed by animals, and we’re increasingly using it to teach intelligent machines too. Boosting the flow of ideas between biologists and computer scientists studying the approach could solve mysteries in animal cognition and help develop powerful new algorithms, say researchers.

Some of the most exciting recent developments in AI, in particular those coming out of Google DeepMind, have relied heavily on reinforcement learning. This refers to a machine learning approach in which agents learn to use feedback from their environment to choose actions that maximize rewards.
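To make that feedback loop concrete, here is a minimal sketch in Python of an agent learning by trial and error on a toy "multi-armed bandit" problem. It is not taken from the paper or from any DeepMind system; the function name `run_bandit`, the reward probabilities, and the epsilon-greedy exploration rule are all assumptions chosen for illustration.

```python
import random

def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a toy bandit problem (illustrative only)."""
    rng = random.Random(seed)
    n_actions = len(reward_probs)
    values = [0.0] * n_actions  # the agent's running reward estimate per action
    counts = [0] * n_actions    # how often each action has been tried
    total_reward = 0.0
    for _ in range(steps):
        # Explore a random action occasionally, otherwise exploit the best estimate.
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=lambda a: values[a])
        # Feedback from the environment: reward 1 with the action's hidden probability.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        counts[action] += 1
        # Move the estimate toward the observed reward (incremental average).
        values[action] += (reward - values[action]) / counts[action]
        total_reward += reward
    return values, counts, total_reward

# Hypothetical environment with three actions of increasing payoff.
values, counts, total = run_bandit([0.2, 0.5, 0.8])
```

The agent is never told which action is best. It only sees rewards, and its estimates in `values` gradually steer it toward the action that pays off most often.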

Much of the inspiration for the earliest reinforcement learning algorithms came from rules developed to describe the learning behavior of animals, and the deep neural networks that more recent approaches rely on also have roots in biology. But despite the common heritage, research in the two fields has diverged.

Biologists tend to study simple learning problems in which choices are linked fairly directly to rewards, but their subjects typically operate in highly dynamic environments where adaptability and continuous learning are important.

In contrast, machine reinforcement learning has been employed to carry out more complex tasks, but the algorithms are generally designed to solve a single problem in a highly controlled environment. That makes them relatively inflexible, and their statistical approach to learning is time-consuming. It also requires training and operation to be separate phases, so the algorithms can’t learn on the job.

But in many ways, this divergence means it’s even more important that the two fields share notes. “Despite differences between work on learning in biological and artificial agents, or perhaps due to these differences, there is much room for the flow of ideas between these fields,” say the authors of a recent paper in Nature.

This kind of exchange has already proved productive, say the researchers. The distinction between model-free and model-based reinforcement learning, developed in machine learning, has helped systems neuroscientists build a much richer understanding of reward-based learning in animals. The former refers to pure trial-and-error learning, while the latter refers to building a statistical model of a problem or environment that can speed up learning.
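The difference between the two approaches can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the method of the paper: the five-state "chain world", the function names, and the use of a plain lookup table as the "model" (which only works because this toy world is deterministic) are all made up for the example.

```python
import random

N_STATES, ACTIONS, GOAL, GAMMA = 5, (0, 1), 4, 0.9  # action 0 = left, 1 = right

def step(state, action):
    """Deterministic chain world: reward 1 only on reaching the goal state."""
    nxt = max(0, state - 1) if action == 0 else min(GOAL, state + 1)
    return nxt, (1.0 if nxt == GOAL else 0.0)

def collect(episodes=200, seed=0):
    """Gather (state, action, reward, next_state) experience with a random policy."""
    rng = random.Random(seed)
    data = []
    for _ in range(episodes):
        s = 0
        while s != GOAL:
            a = rng.choice(ACTIONS)
            s2, r = step(s, a)
            data.append((s, a, r, s2))
            s = s2
    return data

def model_free_q(data, alpha=0.5):
    """Model-free: adjust value estimates directly from each experienced transition."""
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for s, a, r, s2 in data:
        target = r + (0.0 if s2 == GOAL else GAMMA * max(q[s2]))
        q[s][a] += alpha * (target - q[s][a])
    return q

def model_based_q(data, iterations=50):
    """Model-based: first learn what each action does, then plan using that model."""
    model = {(s, a): (r, s2) for s, a, r, s2 in data}  # assumes a deterministic world
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(iterations):
        for (s, a), (r, s2) in model.items():
            q[s][a] = r + (0.0 if s2 == GOAL else GAMMA * max(q[s2]))
    return q

data = collect()
q_free = model_free_q(data)
q_model = model_based_q(data)
```

Both learners see the same experience. The model-free one only improves a value when it happens to revisit the relevant situation, while the model-based one can replay its learned model as often as it likes without gathering any new experience.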

Model-free theories have been particularly effective at explaining the neural basis of reinforcement learning in mammalian brains, in particular the activity of neurons that release the brain’s reward chemical, dopamine, and how that affects behavior. Researchers have found that there are parallel neural systems that learn at different rates and also make it possible for the brain to solve learning problems on multiple timescales.
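One way to picture systems that learn at different rates is a simple prediction-error update run with two different step sizes. The Python sketch below is a textbook-style illustration rather than the model used in the paper; the `track` function, the reward sequence, and the two learning rates are hypothetical choices.

```python
def track(rewards, learning_rate, initial=0.0):
    """Delta-rule learner: nudge the estimate by a fraction of the prediction error."""
    estimate, history = initial, []
    for r in rewards:
        prediction_error = r - estimate  # outcome minus expectation
        estimate += learning_rate * prediction_error
        history.append(estimate)
    return history

# Hypothetical reward stream: rewarded on every trial for 50 trials, then never again.
rewards = [1.0] * 50 + [0.0] * 50
fast = track(rewards, learning_rate=0.5)   # adapts quickly, forgets quickly
slow = track(rewards, learning_rate=0.05)  # adapts slowly, retains longer
```

The fast learner locks onto the reward within a handful of trials and abandons its expectation just as quickly when the reward stops, while the slow learner lags in both directions. Running both in parallel gives a system estimates that are useful on short and long timescales.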

There’s also behavioral evidence that animal brains use model-based learning, though the neural underpinnings are so far unclear. Examples include learning to learn, which refers to the ability to draw on experience from similar problems to solve related ones more quickly. Animals can also use those experiences to build what are essentially statistical models that help them make predictions about the problem.

The discovery that animals use multiple learning systems, each optimized for a different environment, is a valuable insight for those looking to develop machine reinforcement learning, say the authors. It suggests that artificial reinforcement learning agents will also have to combine model-free and model-based components if they are to approach the capabilities and efficiency of the human brain.

What remains a mystery is how animals, and in particular humans, learn to coordinate complex behavior made up of actions that range from activating a specific muscle to making long-term strategic decisions. Here again, advances in machine reinforcement learning could help provide insight, say the authors.

An emerging approach known as hierarchical reinforcement learning lumps low-level actions together into hierarchically organized sub-goals, which can be learned more efficiently. Little work has been done so far on how animals solve these kinds of problems, so these models could be useful jumping-off points for biologists.

According to the study, one area that might accelerate cross-pollination between the two fields is neuromorphic computing, which aims to create processors that more closely mimic the physical structure of the brain. This could provide a powerful new tool for designing and testing brain-inspired algorithms, both to probe theories of animal cognition and to find effective solutions to real-world problems.

Image Credit: Lissi Lyngsoe /

I am a freelance science and technology writer based in Bangalore, India. My main areas of interest are engineering, computing and biology, with a particular focus on the intersections between the three.
