Google DeepMind’s New AI Matches Gold Medal Performance in Math Olympics

After cracking an unsolvable mathematics problem last year, AI is back to tackle geometry.

Developed by Google DeepMind, a new algorithm, AlphaGeometry, can crush problems from past International Mathematical Olympiads—a top-level competition for high schoolers—and matches the performance of previous gold medalists.

When challenged with 30 difficult geometry problems, the AI successfully solved 25 within the standard allotted time, beating previous state-of-the-art algorithms by 15 answers.

While often considered the bane of high school math class, geometry is embedded in our everyday life. Art, astronomy, interior design, and architecture all rely on geometry. So do navigation, maps, and route planning. At its core, geometry is a way to describe space, shapes, and distances using logical reasoning.

In a way, solving geometry problems is a bit like playing chess. Given a set of rules, in this case established theorems, there are a limited number of valid moves at each step, but finding the one that makes sense relies on flexible reasoning that still conforms to stringent mathematical rules.

In other words, tackling geometry requires both creativity and structure. While humans develop these mental acrobatics through years of practice, AI has long struggled with them.

AlphaGeometry cleverly combines both features into a single system. It has two main components: a rule-bound logical model that attempts to find an answer, and a large language model that generates out-of-the-box ideas. If the AI fails to find a solution through logical reasoning alone, the language model kicks in to provide new angles. The result is an AI with both creativity and reasoning skills that can explain its solutions.

The system is DeepMind’s latest foray into solving mathematical problems with machine intelligence. But their eyes are on a larger prize. AlphaGeometry is built for logical reasoning in complex environments—such as our chaotic everyday world. Beyond mathematics, future iterations could potentially help scientists find solutions in other complicated systems, such as deciphering brain connections or unraveling genetic webs that lead to disease.

“We’re making a big jump, a big breakthrough in terms of the result,” study author Dr. Trieu Trinh told the New York Times.

Double Team

A quick geometry question: Picture a triangle with two sides equal in length. How do you prove the bottom two angles are exactly the same?

This is one of the first challenges AlphaGeometry faced. To solve it, you need a full grasp of the rules of geometry but also the creativity to inch toward the answer.

“Proving theorems showcases the mastery of logical reasoning…signifying a remarkable problem-solving skill,” the team wrote in research published today in Nature.

Here’s where AlphaGeometry’s architecture excels. Dubbed a neuro-symbolic system, it first tackles a problem with its symbolic deduction engine. Imagine this engine as a straight-A student who strictly studies math textbooks and follows the rules. It’s guided by logic and can easily lay out every step leading to a solution, like showing a line of reasoning on a math test.

These systems are old school but incredibly powerful, in that they don’t suffer from the “black box” problem that haunts many modern deep learning algorithms.

Deep learning has reshaped our world. But due to how these algorithms work, they often can’t explain their output. This just won’t do when it comes to math, which relies on stringent logical reasoning that can be written down.

Symbolic deduction engines counteract the black box problem in that they’re rational and explainable. But faced with complex problems, they’re slow and struggle to flexibly adapt.

Here’s where large language models come in. The driving force behind ChatGPT, these algorithms excel at finding patterns in complicated data and generating new solutions, given enough training data. But they often lack the ability to explain themselves, making it necessary to double-check their results.

AlphaGeometry combines the best of both worlds.

When faced with a geometry problem, the symbolic deduction engine gives it a go first. Take the triangle problem. The algorithm “understands” the premise of the question, in that it needs to prove the bottom two angles are the same. The language model then suggests drawing a new line from the top of the triangle straight down to the bottom to help solve the problem. Each new element that moves the AI towards the solution is dubbed a “construct.”
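On paper, that construct unlocks the classic human proof, sketched here for reference; the proof AlphaGeometry actually writes out may be organized differently.

```latex
% Classic proof that an isosceles triangle's base angles are equal.
% Let triangle ABC have AB = AC, and let M be the midpoint of BC
% (the added construct: the line from the apex down to the base).
\begin{proof}
Since $AB = AC$, $BM = CM$, and $AM = AM$, the two smaller triangles
satisfy $\triangle ABM \cong \triangle ACM$ by the side-side-side
criterion. Corresponding angles of congruent triangles are equal, so
$\angle ABM = \angle ACM$; the two bottom angles are the same.
\end{proof}
```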

The symbolic deduction engine takes the advice and writes down the logic behind its reasoning. If the construct doesn’t work, the two systems go through multiple rounds of deliberation until AlphaGeometry reaches the solution.
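In code, that back-and-forth can be sketched as a simple loop. Everything below is an illustrative stand-in, not DeepMind’s implementation: the fact strings, the rule encoding, and the `suggest` function playing the role of the language model are all toy assumptions.

```python
# Minimal sketch of a neuro-symbolic loop in the spirit of the article:
# a symbolic engine deduces what it can, and when it stalls, a stand-in
# for the language model proposes a new construct.

def deduce(facts, rules):
    """Symbolic engine: apply rules until no new facts appear."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

def solve(premises, goal, rules, suggest_construct, max_rounds=5):
    """Alternate pure deduction with 'creative' construct suggestions."""
    facts = set(premises)
    for _ in range(max_rounds):
        facts = deduce(facts, rules)
        if goal in facts:
            return True  # goal reached via rule-by-rule reasoning
        construct = suggest_construct(facts)
        if construct is None:
            return False  # no new ideas left
        facts.add(construct)
    return False

# Toy version of the isosceles-triangle problem from above.
rules = [
    (frozenset({"AB=AC", "AM drawn to midpoint of BC"}),
     "triangle ABM congruent to triangle ACM"),
    (frozenset({"triangle ABM congruent to triangle ACM"}),
     "angle B = angle C"),
]
suggest = lambda facts: (
    "AM drawn to midpoint of BC"
    if "AM drawn to midpoint of BC" not in facts else None
)
```

Here `solve({"AB=AC"}, "angle B = angle C", rules, suggest)` succeeds only after the construct is added, mirroring how deduction alone stalls until the “creative” component contributes an idea.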

The whole setup is “akin to the idea of ‘thinking, fast and slow,’” wrote the team on DeepMind’s blog. “One system provides fast, ‘intuitive’ ideas, and the other, more deliberate, rational decision-making.”

We Are the Champions

Unlike text or audio, geometry offers few ready-made training examples, which made it difficult to train AlphaGeometry.

As a workaround, the team generated their own dataset featuring 100 million synthetic examples of random geometric shapes and mapped relationships between points and lines—similar to how you solve geometry in math class, but at a far larger scale.

From there, the AI grasped rules of geometry and learned to work backwards from the solution to figure out if it needed to add any constructs. This cycle allowed the AI to learn from scratch without any human input.
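A toy version of that generate-then-trace-back pipeline might look like the following; the fact and rule encodings are placeholder strings, since the actual system works with geometric diagrams rather than symbols like these.

```python
# Illustrative sketch of synthetic-data generation as described above:
# sample premises, run forward deduction, then treat each derived fact
# as a "theorem" whose backward trace becomes a training example.

RULES = [
    (frozenset({"a", "b"}), "c"),
    (frozenset({"c"}), "d"),
    (frozenset({"b", "d"}), "e"),
]

def forward_deduce(premises):
    """Return all derived facts and, for each, the rule body used."""
    facts, trace = set(premises), {}
    changed = True
    while changed:
        changed = False
        for body, head in RULES:
            if body <= facts and head not in facts:
                facts.add(head)
                trace[head] = body
                changed = True
    return facts, trace

def make_example(goal, trace):
    """Work backwards from a derived fact to the premises it needs."""
    needed, frontier = set(), [goal]
    while frontier:
        fact = frontier.pop()
        if fact in trace:
            frontier.extend(trace[fact])  # recurse into the derivation
        else:
            needed.add(fact)  # an original premise
    return sorted(needed), goal

premises = {"a", "b"}
facts, trace = forward_deduce(premises)
examples = [make_example(f, trace) for f in facts - premises]
```

Each entry in `examples` pairs a derived fact with the minimal premises behind it, a small-scale analogue of mining proofs from random diagrams.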

Putting the AI to the test, the team challenged it with 30 Olympiad problems from over a decade of previous competitions. The generated results were evaluated by a previous Olympiad gold medalist, Evan Chen, to ensure their quality.

In all, the AI matched the performance of past gold medalists, completing 25 problems within the time limit. The previous state-of-the-art result was 10 correct answers.

“AlphaGeometry’s output is impressive because it’s both verifiable and clean,” Chen said. “It uses classical geometry rules with angles and similar triangles just as students do.”

Beyond Math

AlphaGeometry is DeepMind’s latest foray into mathematics. In 2021, their AI cracked mathematical puzzles that had stumped humans for decades. More recently, they used large language models to reason through STEM problems at the college level and cracked a previously “unsolvable” math problem based on a card game with the algorithm FunSearch.

For now, AlphaGeometry is tailored to geometry, and with caveats. Much of geometry is visual, but the system can’t “see” the drawings, an ability that could expedite problem solving. Adding images, perhaps with Google’s Gemini AI, launched late last year, may bolster its geometric smarts.

A similar strategy could also expand AlphaGeometry’s reach to a wide range of scientific domains that require stringent reasoning with a touch of creativity. (Let’s be real—it’s all of them.)

“Given the wider potential of training AI systems from scratch with large-scale synthetic data, this approach could shape how the AI systems of the future discover new knowledge, in math and beyond,” wrote the team.

Image Credit: Joel Filipe / Unsplash 

Shelly Fan
Shelly Xuelai Fan is a neuroscientist-turned-science writer. She completed her PhD in neuroscience at the University of British Columbia, where she developed novel treatments for neurodegeneration. While studying biological brains, she became fascinated with AI and all things biotech. Following graduation, she moved to UCSF to study blood-based factors that rejuvenate aged brains. She is the co-founder of Vantastic Media, a media venture that explores science stories through text and video, and runs an award-winning blog. Her first book, "Will AI Replace Us?" (Thames & Hudson), was published in 2019.