Designing Robots That Learn as Effortlessly as Babies

A wide-eyed, rosy-cheeked, babbling human baby hardly looks like the ultimate learning machine.

But under the hood, an 18-month-old can outlearn any state-of-the-art artificial intelligence algorithm.

Their secret sauce?

They watch; they imitate; and they extrapolate.

Artificial intelligence researchers have begun to take notice. This week, two separate teams dipped their toes into cognitive psychology and developed new algorithms that teach machines to learn like babies. One instructs computers to imitate; the other, to extrapolate.

When challenged to write letters in a foreign alphabet after being given only a few examples (a “one-shot” learning task), the extrapolation algorithm outperformed leading AI algorithms by far.

“Humans are the most imitative creature on the planet and young kids seamlessly intertwine imitation and innovation,” said Dr. Andrew Meltzoff, a psychology professor at the University of Washington who directed one of the studies.

Why not design robots that learn as effortlessly as a child, he asked.

Baby Babble and Machine Trouble

To rapidly make sense of a bustling — even chaotic — world, babies instinctively learn by observing elements of that world. The discrete pieces of information — faces, places, objects — are then rapidly transformed into concepts, forming a flexible framework that lets babies ask questions beginning with “what.”

They’re crazy efficient, Dr. Brenden Lake, who co-authored one of the studies, explained to Science Magazine. For example, when babies start learning their native language, they often need only a few examples to grasp the basics of a chair, a hairbrush or a scooter. In a way, they extrapolate the essence of the word, and acquire the ability to use the word and classify the object mostly correctly.

But babies aren’t just observers. Before the age of two, they’re asking an even harder question: “Why?” That is, they become extremely adept at understanding people’s intentions. Subconsciously, their brains flood with questions. What’s the person trying to do? How are they going about it? Can I get to the same goal differently?

By watching others, babies pick up essential skills, social rules and laws of nature. They then categorize them into concepts and combine these building blocks in new ways to invent new solutions.

In contrast, even high-performance machine learning models are, well, mechanical.

We don’t code specific sets of rules — for example, what makes a cat a cat — to teach machines, said Lake. Instead, the best approach currently is to provide the computer with thousands of examples and let the algorithm figure out the best solution.

Despite genuine progress in recent years, for most concepts in our natural world machines still fall short, especially when given only a few examples, said Lake. What’s more, unlike people, machines can’t generalize what they’ve learned — they can’t apply the knowledge to new problems.

Think back to when Segways first came out, explained Lake. It took maybe a few looks for you to figure out its general gist; but you didn’t stop there. You also understood that the handlebars connected to the wheels, and they were powered by motors, and there’s a stand for your feet. Using this knowledge, you can sketch your version of a Segway, or a different vehicle inspired by the Segway.

Machines can’t. And that’s what Meltzoff, Lake and their respective teams are trying to fix.

Imitation Game

Meltzoff, in collaboration with Dr. Rajesh Rao at the University of Washington, tackled the first steps of baby learning — observation and imitation.

The team developed an algorithm that allows a robot to consider how its actions can lead to different outcomes. The robot then builds a probabilistic model that it uses to interpret the actions and intentions of its human handler. When unsure, the robot can ask humans for help.
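To make the idea concrete, here is a minimal sketch of goal inference with Bayes' rule, including the "ask when unsure" step. It is an illustration only, not the team's actual model: the goal names, action names, probabilities, the `infer_goal` and `decide` helpers and the 0.8 confidence threshold are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical toy numbers: P(observed action | goal) for each candidate goal.
# A real robot would estimate these from its own exploration; here they are made up.
LIKELIHOOD: Dict[str, Dict[str, float]] = {
    "move_toy_left":  {"push_left": 0.7, "push_right": 0.1, "lift": 0.2},
    "move_toy_right": {"push_left": 0.1, "push_right": 0.7, "lift": 0.2},
}

def infer_goal(actions: List[str],
               prior: Optional[Dict[str, float]] = None) -> Dict[str, float]:
    """Return P(goal | observed actions), updating with Bayes' rule per action."""
    goals = list(LIKELIHOOD)
    # Start from a uniform prior unless one is supplied.
    posterior = dict(prior) if prior else {g: 1.0 / len(goals) for g in goals}
    for action in actions:
        for g in goals:
            posterior[g] *= LIKELIHOOD[g][action]
        total = sum(posterior.values())
        posterior = {g: p / total for g, p in posterior.items()}
    return posterior

def decide(posterior: Dict[str, float], threshold: float = 0.8) -> str:
    """Commit to the most probable goal, or ask a human if confidence is low."""
    best = max(posterior, key=posterior.get)
    return best if posterior[best] >= threshold else "ask_human"

# Two pushes to the left are strong evidence; a single lift is ambiguous.
print(decide(infer_goal(["push_left", "push_left"])))
print(decide(infer_goal(["lift"])))
```

In this toy setup, an action that is equally likely under both goals leaves the posterior unchanged, so the robot falls back on asking for help.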

The team tested their new algorithm with two tasks, both mimicking previous experiments that Meltzoff had done with human babies. In one, the robot learned to follow the gaze of the experimenter — a task far harder than it sounds.

The robot has to learn how its own head moves, and understand that the human handler’s head can move according to the same rules, explained Meltzoff. It then tracks the human’s head movements in space and time, and infers from the person’s gaze what he is looking at. The robot then models its own actions based on that information and fixates on the same location as the handler.

When Meltzoff blindfolded the handler, the robot gradually inferred that the person could no longer see, and stopped following his gaze.

In the second experiment, the robot played around with food-shaped toys with its grippers. It then had to mimic a human who pushed the toys around. Rather than rigidly following its mentor’s every minute movement, however, the robot developed its own way to get to the same goal.

For example, compared to pushing a toy, it’s more practical for the robot to use its grippers to pick it up, move it and let it go, said lead author Michael Chung.

But that’s incredibly high-level.

“The robot has to know what the goal is, which is a tough problem and what we tackled in this study,” said Chung.

Capturing Concepts

In contrast to Meltzoff, Lake and his colleagues worked on the next steps of learning: building concepts and extrapolating from them.

Specifically, the team developed a new computational model that allowed machines to acquire and apply a wide range of new visual concepts based on just a few examples.

Traditional machine learning algorithms take a statistical approach — they treat concepts as patterns of pixels or configurations of features. The learning problem is really about finding patterns and features, explained Lake.

Lake’s algorithm treats concepts as models of the world, much akin to babies. This way, “learning becomes a process of model-building, or explaining the data,” said Lake.

The team used a very specific task: They trained the algorithm on a set of pen strokes used to form handwritten characters from writing systems around the world (for example, Tibetan). Certain sequences of pen strokes that are often used together to form a character (say, “A”) are grouped as a concept.

The algorithm automatically codes each concept as a computer program — a “probabilistic model.” When researchers ran the program, it generated a set of virtual pen strokes that formed the concept. For example, one program could write the letter “A” with a series of pen strokes, similar to how people would draw the letter.
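As a rough illustration of a concept stored as a program that generates pen strokes, consider the sketch below. It is not the model from the study: the `LETTER_A` template, the `sample_character` function and the Gaussian jitter are hypothetical stand-ins chosen to keep the example short.

```python
import random
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Stroke = List[Point]

# Hypothetical template for a capital "A": two slanted strokes and a crossbar.
LETTER_A: List[Stroke] = [
    [(0.0, 0.0), (0.5, 1.0)],
    [(0.5, 1.0), (1.0, 0.0)],
    [(0.25, 0.5), (0.75, 0.5)],
]

def sample_character(template: List[Stroke],
                     jitter: float = 0.03,
                     rng: Optional[random.Random] = None) -> List[Stroke]:
    """Run the 'program' once: emit each stroke in order, with Gaussian noise per point."""
    rng = rng or random.Random()
    return [
        [(x + rng.gauss(0, jitter), y + rng.gauss(0, jitter)) for x, y in stroke]
        for stroke in template
    ]

# Every run yields a slightly different, but still recognizable, "A".
for stroke in sample_character(LETTER_A, rng=random.Random(0)):
    print(stroke)
```

Each call behaves like a different person writing the same letter: the stroke order and overall shape stay fixed while the exact pen positions vary.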

Next, the team challenged the algorithm with characters that it had never seen before, and asked it to guess — stroke by stroke — how the character was written. As an even harder task, the algorithm had to produce a character that looked like it might belong to a given alphabet.
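One simplified way to picture learning from a single example is to ask which known character, treated as a noisy generative template, best explains the new one. The sketch below does this with a Gaussian noise model over stroke points. It is far cruder than the study's approach, and the `log_likelihood` and `classify` helpers, the `sigma` value and the sample characters are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Character = List[List[Point]]  # a character is a list of strokes

def log_likelihood(candidate: Character, template: Character,
                   sigma: float = 0.1) -> float:
    """Log-probability (up to a constant) that template + Gaussian noise produced candidate."""
    if len(candidate) != len(template):
        return float("-inf")  # this toy model cannot explain a different stroke count
    total = 0.0
    for cand_stroke, tmpl_stroke in zip(candidate, template):
        if len(cand_stroke) != len(tmpl_stroke):
            return float("-inf")
        for (xc, yc), (xt, yt) in zip(cand_stroke, tmpl_stroke):
            total -= ((xc - xt) ** 2 + (yc - yt) ** 2) / (2 * sigma ** 2)
    return total

def classify(candidate: Character, examples: Dict[str, Character]) -> str:
    """Pick the label whose single example best explains the candidate."""
    return max(examples, key=lambda name: log_likelihood(candidate, examples[name]))

# One example per class, then a wobbly new drawing to label.
examples = {
    "L": [[(0.0, 1.0), (0.0, 0.0)], [(0.0, 0.0), (1.0, 0.0)]],
    "V": [[(0.0, 1.0), (0.5, 0.0)], [(0.5, 0.0), (1.0, 1.0)]],
}
noisy_v = [[(0.05, 0.95), (0.5, 0.05)], [(0.45, 0.0), (0.95, 1.0)]]
print(classify(noisy_v, examples))
```

The point of the sketch is the framing, not the arithmetic: classification becomes a question of which generative explanation fits best, rather than which pixel pattern looks closest.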

In both tasks, the algorithm did just as well as human volunteers given the same amount of training, and outperformed AI approaches that used deep learning.

“If we want to develop algorithms that not just understand but also predict data, we have to look at and be inspired by the best example of intelligence that we have — human beings,” said Lake.

Both studies are early steps towards that goal.

We identified several key ingredients that really boost learning in our algorithm, which scientists also see in human infants, said Lake. One is being able to understand causality, which represents how something came to be. In our case with letters, it’s the process of writing.

Another one is “learning to learn,” which is the idea that knowledge or previous learning from related concepts can help learning of new concepts.

Meltzoff agrees. By emulating human development, he believes that robots will be able to learn increasingly sophisticated tasks by observing, imitating and extrapolating from other humans and robots.

Mimicking human development is likely not the only way to build intelligent machines. It’s possibly not even the best way. But it has groundbreaking potential.

Bringing together AI scientists and developmental psychologists may let us combine the best of human learning and the best of machine learning — and eventually, benefit both, Meltzoff said.

Shelly Fan
Shelly Xuelai Fan is a neuroscientist-turned-science writer. She completed her PhD in neuroscience at the University of British Columbia, where she developed novel treatments for neurodegeneration. While studying biological brains, she became fascinated with AI and all things biotech. Following graduation, she moved to UCSF to study blood-based factors that rejuvenate aged brains. She is the co-founder of Vantastic Media, a media venture that explores science stories through text and video, and runs an award-winning blog. Her first book, "Will AI Replace Us?" (Thames & Hudson), was published in 2019.