Ever seen a baby gazelle learn to walk? A fawn, which is basically a mammalian daddy longlegs, scrambles to its feet, falls, stands, and falls again. Eventually, it stands long enough to flail its toothpick-like legs into a series of near falls…ahem, steps. Amazingly, a few minutes after this endearing display, the fawn is hopping around like an old pro.
Well, now we have a robot version of this classic Serengeti scene.
The fawn in this case is a robotic dog at the University of California, Berkeley. And it’s likewise a surprisingly quick learner (relative to the rest of robot-kind). The robot is also special because, unlike other flashier robots you might have seen online, it uses artificial intelligence to teach itself how to walk.
Beginning on its back, legs waving, the robot learns to flip itself over, stand up, and walk in an hour. A further ten minutes of harassment with a roll of cardboard is enough to teach it how to withstand and recover from being pushed around by its handlers.
It’s not the first time a robot has used artificial intelligence to learn to walk. But while prior robots learned the skill by trial and error over innumerable iterations in simulations, the Berkeley bot learned entirely in the real world.
In a paper published on the arXiv preprint server, the researchers (Danijar Hafner, Alejandro Escontrela, and Philipp Wu) say transferring algorithms that have learned in simulation to the real world isn’t straightforward. Little details and differences between the real world and simulation can trip up fledgling robots. On the other hand, training algorithms in the real world is impractical: It’d take too much time and cause too much wear and tear.
Four years ago, for example, OpenAI showed off an AI-enabled robotic hand that could manipulate a cube. The control algorithm, Dactyl, needed some 100 years’ worth of experience in a simulation powered by 6,144 CPUs and 8 Nvidia V100 GPUs to accomplish this relatively simple task. Things have advanced since then, but the problem largely remains. Pure reinforcement learning algorithms need so much trial and error to learn a skill that training them in the real world is impractical. Simply put, the learning process would break researchers and robots before making any meaningful progress.
The Berkeley team set out to solve this problem with an algorithm called Dreamer. Constructing what’s called a “world model,” Dreamer can project the probability a future action will achieve its goal. With experience, the accuracy of its projections improves. By filtering out less successful actions in advance, the world model allows the robot to more efficiently figure out what works.
“Learning world models from past experience enables robots to imagine the future outcomes of potential actions, reducing the amount of trial and error in the real environment needed to learn successful behaviors,” the researchers write. “By predicting future outcomes, world models allow for planning and behavior learning given only small amounts of real world interaction.”
In other words, a world model can reduce the equivalent of years of training time in a simulation to no more than an awkward hour in the real world.
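To make the idea concrete, here is a toy sketch in Python. It is not the Dreamer algorithm, and nothing in it comes from the paper: the one-dimensional world, the `WorldModel` class, and the `plan` and `run` functions are all made up for illustration. A pretend robot tries each action once in the "real world," learns how far each action moves it, and then picks its moves by imagining short action sequences inside the learned model rather than by real trial and error.

```python
from itertools import product

GOAL = 5                      # hypothetical target position
ACTIONS = ("left", "right")   # the robot does not know their effects in advance

def real_step(position, action):
    """Stand-in for the physical robot: each call is one costly real-world trial."""
    return position + (1 if action == "right" else -1)

class WorldModel:
    """Learns the average displacement each action causes from real transitions."""
    def __init__(self):
        self.total = {a: 0.0 for a in ACTIONS}
        self.count = {a: 0 for a in ACTIONS}

    def update(self, state, action, next_state):
        self.total[action] += next_state - state
        self.count[action] += 1

    def predict(self, state, action):
        if self.count[action] == 0:
            return state  # no data yet: assume the action does nothing
        return state + self.total[action] / self.count[action]

def plan(model, state, horizon=3):
    """Imagine every action sequence of length `horizon` inside the model.

    Returns the first action of the sequence with the lowest summed distance
    to the goal, plus the number of imagined (not real) steps taken.
    """
    best_first, best_cost, imagined = None, float("inf"), 0
    for seq in product(ACTIONS, repeat=horizon):
        s, cost = state, 0.0
        for a in seq:
            s = model.predict(s, a)
            cost += abs(GOAL - s)
            imagined += 1
        if cost < best_cost:
            best_first, best_cost = seq[0], cost
    return best_first, imagined

def run(max_real_steps=50):
    model = WorldModel()
    position, real_steps, imagined_steps = 0, 0, 0
    # Brief real-world exploration: try each action exactly once.
    for action in ACTIONS:
        nxt = real_step(position, action)
        model.update(position, action, nxt)
        position, real_steps = nxt, real_steps + 1
    # From here on, choose every real move by planning in imagination.
    while position != GOAL and real_steps < max_real_steps:
        action, n = plan(model, position)
        imagined_steps += n
        nxt = real_step(position, action)
        model.update(position, action, nxt)
        position, real_steps = nxt, real_steps + 1
    return position, real_steps, imagined_steps

if __name__ == "__main__":
    print(run())
```

In this toy, `run()` returns `(5, 7, 120)`: the robot reaches the goal after 7 real steps while taking 120 imagined ones. The trade is the point of the sketch: imagined steps are cheap, real ones are not. The real system learns its model from sensor data rather than from a lookup of average displacements, so treat this only as an illustration of the general principle.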
The approach may have wider relevance than robot dogs too. The team also applied Dreamer to a pick-and-place robotic arm and a wheeled robot. In both cases, they found Dreamer allowed their robots to efficiently learn relevant skills, no sim time required. More ambitious future applications might include self-driving cars.
Of course, there are still challenges to address. Although reinforcement learning automates some of the intricate hand-coding behind today’s most advanced robots, it does still require engineers to define a robot’s goals and what constitutes success, an exercise that is both time-consuming and open-ended for real-world environments. Also, though the robot survived the team’s experiments here, longer training on more advanced skills may prove too much for future bots to survive without damage. The researchers say it might be fruitful to combine simulator training with fast real-world learning.
Still, the results advance AI in robotics another step. Dreamer strengthens the case that “reinforcement learning will be a cornerstone tool in the future of robot control,” Jonathan Hurst, a professor of robotics at Oregon State University, told MIT Technology Review.
Image Credit: Danijar Hafner / YouTube