“Instead of trying to produce a programme to simulate the adult mind, why not rather try to produce one which simulates the child’s? If this were then subjected to an appropriate course of education one would obtain the adult brain.”
Alan Turing famously wrote this in his groundbreaking 1950 paper Computing Machinery and Intelligence, and laid the framework for generations of machine learning scientists to follow. Yet, despite increasingly impressive specialized applications and breathless predictions, we’re still some distance from programs that can simulate any mind, even one much less complex than a human’s.
Perhaps the key came in what Turing said next: “Our hope is that there is so little mechanism in the child brain that something like it can be easily programmed.” This seems, in hindsight, naive. Moravec’s paradox applies: things that seem like the height of human intellect, like a good, stimulating game of chess, are easy for machines, while tasks that seem simple to us, such as perception and movement, can be extremely difficult. But if children are our template for the simplest general human-level intelligence we might program, then surely it makes sense for AI researchers to study the many millions of existing examples.
This is precisely what Professor Alison Gopnik and her team at Berkeley do. They seek to answer the question: how sophisticated are children as learners? Where are children still outperforming the best algorithms, and how do they do it?
General, Unsupervised Learning
Some of the answers were outlined in a recent talk at the International Conference on Machine Learning. The first and most obvious difference between four-year-olds and our best algorithms is that children are extremely good at generalizing from a small set of examples. ML algorithms are the opposite: they can extract structure from huge datasets that no human could ever process, but generally large amounts of training data are needed for good performance.
This training data usually has to be labeled, although unsupervised learning approaches are also making progress. With labeled data, there is often a strong “supervisory signal” coded into the algorithm and its dataset, consistently reinforcing the algorithm as it improves. Children, by contrast, can learn to perform well on a wide variety of tasks with very little supervision, and they can generalize what they’ve learned to situations they’ve never seen before.
Even in image recognition, where ML has made great strides, algorithms require a large set of images before they can confidently distinguish objects; children may only need one. How is this achieved?
Professor Gopnik and others argue that children have “abstract generative models” that explain how the world works. In other words, children have imagination: they can ask themselves abstract questions like “If I touch this sharp pin, what will happen?” And then, from very small datasets and experiences, they can anticipate the solution.
In doing so, they are correctly inferring the relationship between cause and effect from experience. Children know that this object will prick them unless handled with care because it’s pointy, and not because it’s silver or because they found it in the kitchen. This may sound like common sense, but making this kind of causal inference from small datasets is still hard for algorithms, especially across such a wide range of situations.
The Power of Imagination
Generative models are increasingly being employed by AI researchers. After all, the best way to show that you understand the structure and rules of a dataset is to produce examples that obey those rules. Such neural networks can compress hundreds of gigabytes of image data into hundreds of megabytes of parameter weights and learn to produce images that look like the dataset. In this way, they “learn” something of the statistics of how the world works. But according to Gopnik, using generative models to generalize the way children do is computationally infeasible.
This is far from the only trick children have up their sleeve which machine learning hopes to copy. Experiments from Professor Gopnik’s lab show that children have well-developed Bayesian reasoning abilities. Bayes’ theorem is all about assimilating new information into your assessment of what is likely to be true based on your prior knowledge. For example, finding an unfamiliar pair of underwear in your partner’s car might be a worrying sign—but if you know that they work in dry-cleaning and use the car to transport lost clothes, you might be less concerned.
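The update described above can be sketched in a few lines of Python. This is only an illustration of Bayes’ theorem for a yes/no hypothesis: the function name and every probability below are made-up assumptions chosen to mirror the example, not figures from Gopnik’s research.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' theorem for a yes/no hypothesis: returns P(hypothesis | evidence)."""
    numerator = p_evidence_if_true * prior
    evidence = numerator + p_evidence_if_false * (1.0 - prior)
    return numerator / evidence


# All numbers are illustrative assumptions, not measured values.
PRIOR = 0.05  # belief that something worrying is going on, before any evidence

# No innocent explanation known: the evidence is very unlikely if nothing is wrong.
worried = posterior(PRIOR, p_evidence_if_true=0.5, p_evidence_if_false=0.01)

# Partner works in dry-cleaning: the same evidence is common even if nothing is wrong.
reassured = posterior(PRIOR, p_evidence_if_true=0.5, p_evidence_if_false=0.4)

print(f"without the dry-cleaning context: {worried:.2f}")
print(f"with the dry-cleaning context:    {reassured:.2f}")
```

The same evidence moves the estimate a long way in the first case and barely at all in the second, because prior knowledge changes how surprising the evidence is.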
Scientists at Berkeley present children with logical puzzles, such as machines that can be activated by placing different types of blocks or complicated toys that require a certain sequence of actions to light up and make music.
When children are given several examples (such as a small dataset of demonstrations of the toy), they can often infer the rules behind how the new system works from the age of three or four. These are Bayesian problems: the children efficiently assimilate the new information to help them understand the universal rules behind the toys. When the system isn’t explained, the children’s inherent curiosity leads them to experiment with these systems, testing different combinations of actions and blocks, to quickly infer the rules behind how they work.
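As a rough sketch of what inferring the rules from a handful of demonstrations can look like computationally, the Python below keeps a belief over every possible set of “activator” blocks and updates it after each demonstration. The block colors, the rule that any single activator lights the machine, the 5% noise level, and the demonstrations themselves are all hypothetical; this is not the protocol used in the Berkeley experiments.

```python
from itertools import combinations

BLOCKS = ("red", "blue", "green")  # hypothetical block types


def all_hypotheses(blocks):
    """Every non-empty set of blocks that might be the machine's activators."""
    return [frozenset(c)
            for r in range(1, len(blocks) + 1)
            for c in combinations(blocks, r)]


def likelihood(hypothesis, placed, lit_up, noise=0.05):
    """P(observation | hypothesis), assuming any one activator lights the machine."""
    predicted = bool(hypothesis & placed)
    return 1.0 - noise if predicted == lit_up else noise


def update(beliefs, placed, lit_up):
    """One Bayesian update: reweight each hypothesis by its likelihood, then normalize."""
    placed = frozenset(placed)
    unnormalized = {h: p * likelihood(h, placed, lit_up) for h, p in beliefs.items()}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}


hypotheses = all_hypotheses(BLOCKS)
beliefs = {h: 1.0 / len(hypotheses) for h in hypotheses}  # uniform prior

# Four made-up demonstrations: (blocks placed on the machine, did it light up?)
demonstrations = [
    ({"red"}, True),
    ({"blue"}, False),
    ({"green"}, False),
    ({"red", "blue"}, True),
]
for placed, lit_up in demonstrations:
    beliefs = update(beliefs, placed, lit_up)

best = max(beliefs, key=beliefs.get)
print(sorted(best), round(beliefs[best], 2))
```

With only four demonstrations, most of the belief ends up on the hypothesis that the red block alone activates the machine.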
Indeed, it’s the curiosity of children that actually allows them to outperform adults in certain circumstances. When an incentive structure is introduced—i.e. “points” that can be gained and lost depending on your actions—adults tend to become conservative and risk-averse. Children are more concerned with understanding how the system works, and hence deploy riskier strategies. Curiosity may kill the cat, but in the right situation, it can allow children to win the game by identifying rules that adults miss by avoiding any action that might result in punishment.
To Explore or to Exploit?
This research shows not only the innate intelligence of children, but also touches on classic problems in algorithm design. The explore-exploit problem is well known in machine learning. Put simply, if you only have a certain amount of resources (time, computational ability, and so on), are you better off searching for new strategies, or simply taking the path that seems to most obviously lead to gains?
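One common way to make this trade-off concrete is the “multi-armed bandit,” in which a learner repeatedly chooses between options with unknown payoffs. The sketch below uses an epsilon-greedy strategy, one standard approach among many; the function name, payoff probabilities, and parameter values are assumptions picked purely for illustration.

```python
import random


def epsilon_greedy(true_means, steps=2000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit, exploring at random with probability epsilon."""
    rng = random.Random(seed)
    counts = [0] * len(true_means)       # how often each option was tried
    estimates = [0.0] * len(true_means)  # running average payoff per option
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(true_means))  # explore: try anything
        else:
            # exploit: pick the option that currently looks best
            arm = max(range(len(true_means)), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return counts, estimates, total_reward


# Hypothetical payoff probabilities; the last option is the best one.
payoffs = [0.2, 0.5, 0.8]
cautious_counts, _, _ = epsilon_greedy(payoffs, epsilon=0.0)
curious_counts, _, _ = epsilon_greedy(payoffs, epsilon=0.1)
print("never explores:    ", cautious_counts)
print("explores 10% of time:", curious_counts)
```

In this sketch, a learner that never explores keeps choosing the first option it tried and never discovers that a better one exists, while a little exploration gives it the chance to find out.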
Children favor exploration over exploitation. This is how they learn: through play and experimentation with their surroundings, through keen observation, and by asking as many questions as they can. Children are social learners: as well as interacting with their environment, they learn from others. Anyone who has ever had to deal with a toddler endlessly using that favorite word, “why?”, will recognize this as a feature of how children learn! As we get older, we switch to exploiting the strategies we’ve already learned rather than taking those risks; in Gopnik’s experiments, this shift kicks in around adolescence.
These concepts are already being imitated in machine learning algorithms. One example is the idea of “temperature” for algorithms that look through possible solutions to a problem to find the best one. A high-temperature search is more likely to pick a random move that might initially take you further away from the reward. This means that the optimization is less likely to get “stuck” on a particular solution that’s hard to improve upon but may not be the best out there. The trade-off is that it’s also slower to find a solution. Meanwhile, searches with lower temperature take fewer “risky” random moves and instead seek to refine what’s already been found.
In many ways, humans develop in the same way, from high-temperature toddlers who bounce around playing with new ideas and new solutions even when they seem strange to low-temperature adults who take fewer risks, are more methodical, but also less creative. This is how we try to program our machine learning algorithms to behave as well.
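The progression from high to low temperature is the core of simulated annealing, a classic search technique. The Python below is a minimal sketch of it under stated assumptions: the bumpy objective function, the step size, and the cooling schedule are invented for illustration, and the code minimizes a cost rather than maximizing a reward.

```python
import math
import random


def acceptance_probability(delta, temperature):
    """Metropolis rule: always accept improvements; accept a move that is
    worse by `delta` with probability exp(-delta / temperature)."""
    if delta <= 0:
        return 1.0
    return math.exp(-delta / temperature)


def bumpy(x):
    """Hypothetical cost function with many local minima."""
    return x * x + 10.0 * math.sin(3.0 * x)


def anneal(objective, start, steps=5000, t_start=5.0, t_end=0.01, seed=0):
    """Search for a low-cost x while cooling from t_start down to t_end."""
    rng = random.Random(seed)
    current, current_score = start, objective(start)
    best, best_score = current, current_score
    for step in range(steps):
        # Geometric cooling: "toddler" temperature early, "adult" temperature late.
        fraction = step / max(steps - 1, 1)
        temperature = t_start * (t_end / t_start) ** fraction
        candidate = current + rng.gauss(0.0, 0.5)  # small random move
        candidate_score = objective(candidate)
        delta = candidate_score - current_score
        if rng.random() < acceptance_probability(delta, temperature):
            current, current_score = candidate, candidate_score
            if current_score < best_score:
                best, best_score = current, current_score
    return best, best_score


best_x, best_cost = anneal(bumpy, start=4.0)
print(f"best x found: {best_x:.2f}, cost: {best_cost:.2f}")
```

Early in the run, the high temperature lets the search accept moves that make things worse, which helps it escape poor local solutions; as the temperature falls, it increasingly just refines the best region it has found.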
It’s nearly 70 years since Turing first suggested that we could create a general intelligence by simulating the mind of a child. The children he looked to for inspiration in 1950 are all knocking on the door of old age today. Yet, for all that machine learning and child psychology have developed over the years, there’s still a great deal that we don’t understand about how children can be such flexible, adaptive, and effective learners.
Understanding the learning process and the minds of children may help us to build better algorithms, but it could also help us to teach and nurture better and happier humans. Ultimately, isn’t that what technological progress is supposed to be about?