There’s a kindergarten I walk past on the way to work, and I can’t help but peek inside every day. The classroom (packed with toys and puzzles, music and books, flower planters and even an occasional cat) was obviously crafted to be a rich and bustling world for kids to interact and play in.
Contrary to its meaning, child’s play is far from simple. It’s not just about having fun; it’s a process of learning, of gaining an understanding of the world. Playing in a diverse, exciting universe is how we nurture a child’s budding intelligence.
So why shouldn’t AI learn in the same way?
Recently, the non-profit institute OpenAI unveiled a virtual world for AI to explore and play in. The project is dubbed Universe, and its goals are as vast as its name: to train a single AI to be proficient at any task a human can do with a computer.
By teaching individual AI “agents” to excel at a variety of real-world tasks, OpenAI hopes to lead us one step closer to truly intelligent bots — those equipped with the flexible reasoning skills we humans possess.
There’s no doubt AIs are getting scarily smart.
Computers can now accurately see, hear and translate languages, sometimes even outperforming humans. Just earlier this year, in a series of high-profile Go games, Google DeepMind’s AlphaGo defeated the 18-time world champion Lee Sedol in an astonishing victory, a decade earlier than some experts expected.
But the truth is, AIs are still only good at what they’re trained to do. Ask AlphaGo to play chess, and the program likely returns the machine equivalent of utter bewilderment, even after you explain the rules in great detail.
As of now, our AI systems are ultra-efficient one-trick ponies. The way they’re trained is partly at fault: researchers generally initialize a blank-slate AI, put it through millions of trials until it masters one task, and call it quits. The AI never experiences anything else, so how would it know how to solve any other problem?
To get to general intelligence (a human-like ability to use previous experiences to tackle new problems), AIs need to carry their experiences into a variety of new tasks. This is where Universe comes in. By experiencing a world full of different scenarios, OpenAI researchers reason, AIs may finally be able to develop world knowledge and flexible problem-solving skills that allow them to “think,” rather than forever getting stuck in a singular loop.
A whole new world
In a nutshell, Universe is a powerful platform encompassing thousands of environments that provides a common way for researchers to train their AI agents.
As a software platform, Universe provides a stage to run other software, and each of these programs contributes a different environment. Atari and Flash games, apps and websites, for example, are already supported.
There’s likely more to come.
In theory, Universe can run any software on any computer, allowing researchers to plug-and-train their AI of choice. It’s like taking a kid to a multi-activity summer camp: pick your favorite niece, select an activity, wait until she masters it, pick another activity, rinse and repeat.
Within Universe, the AI interacts with the virtual world the way humans use a computer: it “sees” screen pixels and uses a virtual keyboard and mouse to issue commands.
This is made possible through Virtual Network Computing (VNC), a desktop-sharing system that transmits keyboard and mouse input from one computer (the AI) to another (the training environment). When the environment changes, VNC sends updated screenshots back to the AI, allowing it to execute its next step. In other words, VNC acts as the AI’s eyes and hands.
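For the curious, that see-pixels, send-keys cycle can be sketched in a few lines of Python. This is a toy illustration only: it does not use Universe or a real VNC connection, and `ToyScreen`, its five-pixel strip and the `("KeyEvent", key, pressed)` tuples are hypothetical stand-ins made up for this example.

```python
class ToyScreen:
    """Hypothetical stand-in for a remote desktop: a 1x5 strip of pixels
    in which a single bright pixel (1) marks the cursor position."""

    WIDTH = 5

    def __init__(self):
        self.cursor = 0

    def screenshot(self):
        # What the agent "sees": raw pixel values, nothing else.
        return [1 if x == self.cursor else 0 for x in range(self.WIDTH)]

    def send(self, event):
        # What the agent "does": a keyboard event, as a remote-desktop
        # client would transmit it. Only key presses move the cursor.
        kind, key, pressed = event
        if kind == "KeyEvent" and pressed:
            if key == "ArrowRight":
                self.cursor = min(self.cursor + 1, self.WIDTH - 1)
            elif key == "ArrowLeft":
                self.cursor = max(self.cursor - 1, 0)


def agent_policy(pixels, target):
    """Choose a key press by looking only at the pixels."""
    cursor = pixels.index(1)
    if cursor == target:
        return None  # already there, nothing to press
    key = "ArrowRight" if cursor < target else "ArrowLeft"
    return ("KeyEvent", key, True)


def run_episode(target=3, max_steps=10):
    screen = ToyScreen()
    frames = [screen.screenshot()]
    for _ in range(max_steps):
        event = agent_policy(frames[-1], target)
        if event is None:
            break
        screen.send(event)                  # hands: send input
        frames.append(screen.screenshot())  # eyes: receive updated pixels
    return frames


frames = run_episode()
print(frames[-1])  # [0, 0, 0, 1, 0]
```

The point of the sketch is the division of labor: the agent never touches the game's internals, only the picture it is shown and the keys it is allowed to press.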
So how does learning happen?
All AIs plugged into Universe are trained through reinforcement learning, a powerful technique that was behind AlphaGo’s success. In reality, the term is just a fancy name for how we train dolphins, dogs, and (dare I say) kids. It’s learning through trial-and-error: choose an action, and if you get rewarded for it, keep doing it. Otherwise, try something else.
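Here is roughly what that trial-and-error loop looks like in Python. It is a minimal sketch of one textbook flavor of reinforcement learning (an "epsilon-greedy" learner on a toy problem with three possible actions), not the method Universe agents or AlphaGo actually use. The function name, the payoff odds and the settings are all assumptions picked for illustration.

```python
import random


def train_by_trial_and_error(reward_prob, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy learning on a toy problem with a handful of actions.

    reward_prob[a] is the hidden chance that action a earns a reward of 1.
    """
    rng = random.Random(seed)
    n = len(reward_prob)
    value = [0.0] * n   # running estimate of each action's payoff
    count = [0] * n     # how often each action has been tried

    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n)                       # try something else
        else:
            action = max(range(n), key=lambda a: value[a])  # keep doing what worked
        reward = 1.0 if rng.random() < reward_prob[action] else 0.0
        count[action] += 1
        # Incremental average: nudge the estimate toward the observed reward.
        value[action] += (reward - value[action]) / count[action]

    return value, count


# Made-up odds: the third action pays off most often, but the learner
# is never told that and has to find out by trying.
value, count = train_by_trial_and_error([0.2, 0.5, 0.8], steps=5000)
best = max(range(len(value)), key=lambda a: value[a])
print("estimated payoffs:", [round(v, 2) for v in value])
print("action the learner rates highest:", best)
```

Most of the time the learner repeats whatever has paid off best so far. The small `epsilon` chance of picking at random is the "try something else" part, and without it the learner could settle on a mediocre action and never discover a better one.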
Rather than starting with a completely blank AI, researchers sometimes give it a boost by letting it “watch” how humans solve a task. This allows the AI to form an initial impression and have a better idea of how to further optimize its solutions.
Reinforcement learning is already used in many AI applications. Inside Universe, however, the power of this tech really shines. Because an AI can hop between games and apps, it can take what it has learned in one app and readily use it to crack another, something dubbed “transfer learning.” It’s a tough skill to master, but a necessary one on the road to intelligent machines.
And according to OpenAI, we’re slowly getting there: some of their agents already show signs of transferring some learning from one driving game to another.
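To get a feel for why carrying knowledge over helps, consider a deliberately simplified Python sketch. Two made-up "driving games" each offer three actions with fixed payoffs. A learner that starts the second game with the value estimates it built up in the first should collect more reward on a short budget than one starting from scratch. Everything here (the games, the numbers, the `play` function) is hypothetical, and reusing a small table of estimates is only a stand-in for what real transfer learning involves.

```python
import random


def play(rewards, start_values, steps, epsilon=0.1, seed=0):
    """Epsilon-greedy play on a toy game with fixed per-action rewards.

    Returns the learned value estimates and the total reward collected.
    """
    rng = random.Random(seed)
    n = len(rewards)
    values = list(start_values)   # copy so the caller's estimates are untouched
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n)                        # explore
        else:
            action = max(range(n), key=lambda a: values[a])  # exploit
        reward = rewards[action]
        values[action] = reward   # rewards are fixed, so one sample is enough
        total += reward
    return values, total


# Two similar toy games; actions are steer left, go straight, steer right.
GAME_A = [0.2, 0.8, 0.3]
GAME_B = [0.25, 0.75, 0.3]

learned_on_a, _ = play(GAME_A, [0.0, 0.0, 0.0], steps=500)

# Same short budget on game B, with and without the knowledge from game A.
_, cold_total = play(GAME_B, [0.0, 0.0, 0.0], steps=30)
_, warm_total = play(GAME_B, learned_on_a, steps=30)

print("from scratch:", round(cold_total, 2))
print("warm start:  ", round(warm_total, 2))
```

The warm-started learner already believes "go straight" is the best move, so it does not waste early turns rediscovering that. The catch, and the reason transfer is hard in practice, is that this only helps when the two tasks really do resemble each other.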
From games to the world of bits
Like many other AI developers, OpenAI used games to kick off Universe for a reason: they’re easy to benchmark. Because games are measured by a variety of stats and scores, the system can easily use these numbers to gauge an AI’s progress and reward the agent accordingly, something absolutely critical for reinforcement learning.
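One simple way to turn a score into a reward, shown here as an assumption rather than a description of how Universe does it, is to reward the agent by however much the score changed since the last step. The Python sketch below uses made-up score readings.

```python
def rewards_from_scores(scores):
    """Turn a sequence of on-screen scores into per-step rewards.

    The reward for a step is how much the score changed during it, so the
    rewards over an episode add up to the total score gained.
    """
    return [after - before for before, after in zip(scores, scores[1:])]


# Hypothetical score readings taken after each frame of a game.
scores = [0, 0, 10, 10, 25, 20]
rewards = rewards_from_scores(scores)
print(rewards)       # [0, 10, 0, 15, -5]
print(sum(rewards))  # 20
```

A falling score produces a negative reward, which is exactly the "try something else" signal a trial-and-error learner needs.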
Because Universe relies on pixels and keyboards, humans are also able to play games on the platform. These sessions are recorded and provide a meaningful baseline to judge AI performances (you can even sign up for the job!).
But games only form a sliver of our interactions with the digital world, and Universe is already expanding beyond their limitations with a project dubbed the Mini World of Bits. Bits is a collection of different web browser interactions we encounter while browsing the Internet: typing things into text boxes or selecting an option from a dropdown menu and clicking “submit.”
These tasks, although simple, form the foundation of how we tap into the treasure trove that is the web. Ultimately OpenAI envisions AIs that can fluidly navigate the web towards a goal — for example, booking a flight. In one of Universe’s environments, researchers can already give an AI a desired booking schedule and train it to browse for the flight on multiple airlines.
And that’s just the beginning.
Universe is only set to grow larger. Microsoft’s Malmo, an AI platform that uses Minecraft as its testing ground, is just about to integrate into Universe, as are the popular protein folding game fold.it, Android apps, HTML5 games and “really anything else people think of.”
Ghost in the machine
So we can now train AI to play multiple games and browse the web. Big deal. Will that really get us to general intelligence?
Maybe. And it’ll be a long road.
But an AI that knows how to win any game you throw at it already knows how to think logically, in multiple steps, towards victory. An AI that can navigate the chaotic world of Grand Theft Auto V already understands the basics of real-world physics, and perhaps even violence and retaliation. An AI that can surf the web already knows how humans normally talk to each other and can use that knowledge to gain information, set up its own web identity, or maybe even peek into yours.
Every day, we learn, play, work and grow inside the digital realm. To many, the world of 0s and 1s is just as real as the natural one we’re born in.
Now that AIs have access to that digital world, it’s their turn to grow — let’s see how far they can go.
Banner Image Credit: Shutterstock