
New Google AI Will Work Out What 98% of Our DNA Actually Does for the Body

AlphaGenome predicts how long stretches of DNA "dark matter" affect gene expression and a host of other important properties.

Edd Gent
Jul 03, 2025

Image Credit: MJH SHIKDER on Unsplash


Vast swathes of the human genome remain a mystery to science. A new AI from Google DeepMind is helping researchers understand how these stretches of DNA impact the activity of other genes.

While the Human Genome Project produced a complete map of our DNA, we still know surprisingly little about what most of it does. Roughly 2 percent of the human genome encodes specific proteins, but the purpose of the other 98 percent is much less clear.

Historically, scientists called this part of the genome “junk DNA.” But there’s growing recognition that these so-called “non-coding” regions play a critical role in regulating the expression of genes elsewhere in the genome.

Teasing out these interactions is a complicated business. But now a new Google DeepMind model called AlphaGenome can take long stretches of DNA and make predictions about how different genetic variants will affect gene expression, as well as a host of other important properties.

“We have, for the first time, created a single model that unifies many different challenges that come with understanding the genome,” Pushmeet Kohli, a vice president for research at DeepMind, told MIT Technology Review.

The so-called “sequence to function” model uses the same transformer architecture as the large language models behind popular AI chatbots. The model was trained on public databases of experimental results testing how different sequences impact gene regulation. Researchers can enter a DNA sequence of up to one million letters, and the model will then make predictions about a wide range of molecular properties impacting the sequence’s regulatory activity.

These include things like where genes start and end, which sections of the DNA are accessible or blocked by certain proteins, and how much RNA is being produced. RNA is the messenger molecule responsible for carrying the instructions contained in DNA to the cell’s protein factories, or ribosomes, as well as regulating gene expression.
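To make the input side concrete, here is a minimal Python sketch of how a DNA string might be turned into numbers before a sequence model sees it. One-hot encoding is a common convention for genomic models, but the article does not say how AlphaGenome represents its input, so treat the encoding, the function name `one_hot_encode`, and the handling of unknown letters as assumptions. Only the one-million-letter limit comes from the article.

```python
# Illustrative sketch only: not AlphaGenome's actual preprocessing.
MAX_INPUT_LENGTH = 1_000_000  # input limit quoted in the article
BASES = "ACGT"                # column order is an assumption


def one_hot_encode(sequence: str) -> list[list[int]]:
    """Return one row of four 0/1 values per DNA letter."""
    sequence = sequence.upper()
    if len(sequence) > MAX_INPUT_LENGTH:
        raise ValueError(
            f"sequence has {len(sequence)} letters; limit is {MAX_INPUT_LENGTH}"
        )
    encoded = []
    for letter in sequence:
        # Letters outside A, C, G, T (such as N) become an all-zero row,
        # which is an assumed convention for unknown bases.
        encoded.append([1 if letter == base else 0 for base in BASES])
    return encoded


if __name__ == "__main__":
    for row in one_hot_encode("ACGN"):
        print(row)
```

A real model would take an array like this as input and return many prediction tracks along the sequence, such as the accessibility and RNA output mentioned above.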

AlphaGenome can also assess the impact of mutations in specific genes by comparing its predictions for sequences with and without each variant, and it can make predictions about RNA “splicing,” a process where RNA molecules are chopped up and packaged before being sent off to a ribosome. Errors in this process are responsible for rare genetic diseases, such as spinal muscular atrophy and some forms of cystic fibrosis.
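The idea of comparing variants can be pictured as scoring the same stretch of DNA twice, once unchanged and once with the mutation swapped in, then taking the difference. The Python sketch below illustrates that pattern under loud assumptions: `toy_predict_expression` is a hypothetical stand-in that simply measures the fraction of G and C letters, not a trained model and not AlphaGenome's API, and the function names are invented for illustration.

```python
# Illustrative sketch only: the predictor is a hypothetical toy, not AlphaGenome.
from typing import Callable


def apply_variant(reference: str, position: int, ref_base: str, alt_base: str) -> str:
    """Return the reference sequence with one letter replaced (0-based position)."""
    if reference[position] != ref_base:
        raise ValueError("reference base does not match the sequence")
    return reference[:position] + alt_base + reference[position + 1:]


def toy_predict_expression(sequence: str) -> float:
    """Hypothetical stand-in for a model: fraction of G and C letters."""
    if not sequence:
        return 0.0
    return sum(1 for base in sequence if base in "GC") / len(sequence)


def variant_effect(
    reference: str,
    position: int,
    ref_base: str,
    alt_base: str,
    predict: Callable[[str], float] = toy_predict_expression,
) -> float:
    """Score a variant as prediction(alternate) minus prediction(reference)."""
    alternate = apply_variant(reference, position, ref_base, alt_base)
    return predict(alternate) - predict(reference)


if __name__ == "__main__":
    print(variant_effect("AAAA", 1, "A", "G"))
```

Passing the predictor in as an argument keeps the comparison logic separate from whatever model produces the scores, so a real model could be substituted without changing `variant_effect`.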


Predicting the impact of different genetic variants could be particularly useful. In a blog post, the DeepMind researchers report that they used the model to predict how mutations that other scientists had discovered in leukemia patients likely activated a nearby gene known to play a role in cancer.

“This system pushes us closer to a good first guess about what any variant will be doing when we observe it in a human,” Caleb Lareau, a computational biologist at Memorial Sloan Kettering Cancer Center who was granted early access to AlphaGenome, told MIT Technology Review.

The model will be free for noncommercial purposes, and DeepMind has committed to releasing full details of how it was built in the future. But it still has limitations. The company says the model can’t make predictions about the genomes of individuals, and its predictions don’t fully explain how genetic variations lead to complex traits or diseases. Further, it can’t accurately predict how non-coding DNA impacts genes that are located more than 100,000 letters away in the genome.

Anshul Kundaje, a computational genomicist at Stanford University in Palo Alto, California, who had early access to AlphaGenome, told Nature that the new model is an exciting development and significantly better than previous models, but not a slam dunk. “This model has not yet ‘solved’ gene regulation to the same extent as AlphaFold has, for example, protein 3D-structure prediction,” he said.

Nonetheless, the model is an important breakthrough in the effort to demystify the genome’s “dark matter.” It could transform our understanding of disease and supercharge synthetic biologists’ efforts to re-engineer DNA for our own purposes.

Edd is a freelance science and technology writer based in Bangalore, India. His main areas of interest are engineering, computing, and biology, with a particular focus on the intersections between the three.

SingularityHub chronicles the technological frontier with coverage of the breakthroughs, players, and issues shaping the future.

© 2025 Singularity