DeepMind’s Protein Folding AI Is Going After Coronavirus

Shelly Fan

Mar 17, 2020

In late December last year, Dr. Li Wenliang began warning about a novel coronavirus in Wuhan, China, but was silenced by the police before tragically succumbing to the disease just over a month later. Almost simultaneously, a computer server halfway across the world started issuing worrying alerts of a potential new outbreak. The server runs software by BlueDot, a company based in Toronto that uses AI to monitor infectious disease outbreaks for signs of early trouble.

Not enough people listened to either human expertise or AI. Then cases skyrocketed in Wuhan and spread across the world, and people had to take note.

Hindsight is 20/20, but it is remarkable that BlueDot and other machine learning-based services are beginning to catch early signs of infectious disease outbreaks in almost the same time frame as health experts, if only for COVID-19. We often hear about AI as the second coming of healthcare, where it can catch cases early, accelerate drug development, and personalize treatment. Yet COVID-19 is the first global pandemic to hold healthcare AI’s feet to the fire in a serious and urgent real-world test case. In a head-to-head race, can AI actually accelerate new anti-virals or vaccines for COVID-19, something the world has never previously seen? Or will traditional biotech measures excel, in turn revealing that AI’s hype massively outstrips reality?

MIT Technology Review recently published an excellent piece that comprehensively looks at how AI, at its current ability level, can help us predict, diagnose, and treat novel viral threats. I’m on board with the general idea: AI’s potential is enormous.

Yet for now, don’t look to AI to help tackle COVID-19; it’s simply not ready.

That said, it is enormously helpful to see how major machine learning companies are using or repositioning their technologies to tackle the crisis. People often criticize AI for being tested on “toy cases,” meaning standardized, narrow datasets that may have little significance in the real world. With companies working on COVID-19, that’s no longer the case.

Ready, player, go? Here’s how one major AI player in healthtech, DeepMind, is trying to knee-cap COVID-19.

AI In “Invisible Man” Prediction

The promise of AI for accelerating drug discovery is an almost universally supported idea. One caveat: so far, though new drug candidates have been discovered using AI, none has made it through the approval process (yet), and the tech has not even been shown to make the whole process faster to market (yet).

In very broad strokes, AI could be enormously helpful for initial drug discovery in two main ways: one, screening through millions of chemical compounds for potential drugs in simulation tests, far faster than any human expert; two, identifying targets on a pathogen that new drugs can latch onto, either to reduce the pathogen’s impact (making people less sick), or to slow its spread among people.

For COVID-19, DeepMind is focusing on the second route. Known mostly for its algorithms that beat human players at Go, StarCraft II, and other games, DeepMind has nevertheless been working directly on solutions for drug discovery. Their secret sauce? AlphaFold, a deep learning system that tries to predict protein structures accurately when no structures of similar proteins are available.

AlphaGo? Fold? Collab?

Stay with me. How a protein “looks” in 3D is essential for developing new drugs, especially for new viruses. COVID-19, for example, has really spiky proteins that jut out from its surface. Normally, human cells don’t care; they won’t let the virus inside. But COVID-19’s spiky proteins also harbor a Trojan Horse that “activates” it in certain cells with a complementary component. Lung cells have an abundance of these factors, which is why they’re susceptible to invasion.

Bottom line: if a drug is going to “fit” into a protein like a key into a lock to trigger a whole cascade of nasty reactions, then the first step is to figure out the structure of the lock. That’s what DeepMind’s AlphaFold is doing.

Thanks to a surge of global collaboration, China released the genomic blueprint of the COVID-19 virus in open-access databases, while others have posted online the structures of some of its proteins, determined either by experiments or through computational modeling. DeepMind is taking these data to the next level by using machine learning to focus on a few understudied but potentially important proteins that could become drug or vaccine targets.

Protein folding has been a decades-long, fundamental problem in biochemistry and drug discovery. Almost all of our existing drugs grab onto certain proteins to work, so identifying protein structure is akin to surveying the enemy landscape and figuring out the best attack point simultaneously. The problem is that the genetic code doesn’t directly translate to how proteins look. When it comes to a new virus, without predicting protein structures we’re basically fighting viruses and diseases as if they were the Invisible Man.

Traditional methods use high-tech microscopes, freezing proteins into crystal-looking entities, and other strange and expensive ways to understand their structure. Under the scope, a protein is basically a chain of chemical “letters” that wrap around itself into intricate structures—kinda like how your headphones always tangle into inconceivable structures while you’re sleeping. For DeepMind and other protein-folding efforts, the key is to predict—and then find methods to decipher drug targets from—those structures.

AlphaFold stands out as a union of decades of deep learning progress, guided by expertise from protein structure databases in the public domain. In a nutshell, AlphaFold uses genome sequences (available for COVID-19 and relatively easy to get) to predict the properties of the resulting proteins that actually do the work, by estimating the “distance” between each pair of “letters,” or components, that make up a certain protein. It doesn’t predict specific sequences with special powers, such as those that bind to a cell, but offers a quick police sketch of the virus perp in sight.
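
To make the idea of “distance” between a protein’s “letters” concrete, here is a minimal Python sketch. It is not AlphaFold’s code and it runs in the opposite direction: AlphaFold tries to predict distances from the sequence alone, while this toy example simply measures them from 3D coordinates that are already known. The five-letter chain, its coordinates, and the 8-angstrom contact cutoff are all invented assumptions for illustration.

```python
import math

# Hypothetical 3D coordinates (in angstroms) for a toy five-letter chain.
# These numbers are invented for illustration, not taken from a real protein.
toy_chain = [
    ("M", (0.0, 0.0, 0.0)),
    ("K", (3.8, 0.0, 0.0)),
    ("V", (7.6, 0.0, 0.0)),
    ("L", (7.6, 3.8, 0.0)),
    ("A", (3.8, 3.8, 0.0)),
]


def distance_map(chain):
    """Return the matrix of pairwise distances between the chain's letters."""
    coords = [xyz for _, xyz in chain]
    return [[math.dist(a, b) for b in coords] for a in coords]


def contact_map(distances, cutoff=8.0):
    """Mark pairs of letters closer than `cutoff` angstroms as being in contact."""
    return [[d < cutoff for d in row] for row in distances]


distances = distance_map(toy_chain)
contacts = contact_map(distances)

# Print one row per letter: its distance to every letter in the chain.
for (letter, _), row in zip(toy_chain, distances):
    print(letter, " ".join(f"{d:5.1f}" for d in row))
```

In this toy chain, the second letter (K) and the fifth letter (A) sit far apart in the sequence but close together in space, because the chain bends back on itself. A table of such pairwise distances captures that kind of fold, which is why predicting the distances is a useful step toward sketching a protein’s overall shape.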

There’s no doubt that AlphaFold is new to the protein-folding game. Even DeepMind itself stresses that “these structure predictions have not been experimentally verified,” but says they could galvanize efforts at making anti-virals and/or vaccines. For now, it’s difficult to judge how much AlphaFold will contribute to fighting the pandemic, if at all. But by automating a critical aspect of drug discovery, it’s also en route to becoming a much larger player in the next epidemic.

Of note: none of this would be possible without public, open-source databases of protein data (like UniProt and the Protein Data Bank) that have been built up over decades. DeepMind’s release, posted with open access, has been lauded by fellow scientists as a way of giving back to the community.

Other Players

China’s long-time Google surrogate and AI behemoth, Baidu, is using an algorithm to predict the structure of another important biomolecule, mRNA. mRNA shuttles information from the genome to protein factories, so if you shoot the mRNA messenger, the viral proteins are never born. Similarly, AI could one day potentially predict epidemics and how a virus changes over time, but it will only help if there’s enough trust to listen to the models.

Various AI companies are also making a play towards efficient diagnostics (identifying COVID-19 signs in medical scans) or other measures to support at-risk and overworked medical frontline heroes. The problem is that with any new outbreak, we don’t have enough data to train an AI, which means that the models will struggle to find subtle differences in imperfect medical scans, at least for now.

So, is AI our savior? Not in this pandemic. Similar to the 2003 SARS outbreak, the best response is something that has existed for centuries: social distancing. As I mentioned previously, before COVID-19 exploded into a pandemic, science was ready to provide answers for COVID-19 as long as governments were also ready to respond. And because AI builds on scientific data and helps with otherwise difficult efforts, machine learning is rapidly learning to do the same.

But perhaps ironically, COVID-19 is exposing both the best and weakest parts of AI in healthcare today: great models that in theory should work, solid predictions that can be tested, but no recommendations that can be accepted without a heavy dose of skepticism. COVID-19 presents a brutal test case for AI in healthcare.

But for now, the toughest case is that of government management and what we do in response.
