Last spring, in a race against Covid-19, 23andMe launched an ambitious study to answer a question on everyone’s minds: who’s likely to get sick, or to get very sick?
And being 23andMe, they hunted for a genetic factor.
We got answers last week. People with the O blood type—something determined by a gene called ABO—are less likely to test positive for Covid-19. Another part of the genome, packed with genes related to the immune system and other biological processes, could be why some people end up on ventilators, whereas others barely catch a cough.
The results are interesting. But it’s how 23andMe got to them that’s mind-blowing. In just one year, the study amassed over one million people to hunt for clues in our genes that could signal vulnerability to the infection. It’s a scale that scientists previously only dreamed of. When mining for genes related to a disease, “scale is the primary challenge,” said Dr. Adam Auton, who led the study.
In other words, consumer genetics is rapidly shifting expectations for these “gene hunting” studies, both in terms of size and speed. Ancestry, 23andMe’s competitor, similarly launched a research study on the genetics of Covid-19, with hundreds of thousands of people participating in the first two weeks. These massive datasets, especially those gathered from people of diverse ancestries, build up a powerful library of our genetic blueprint, ready to help tackle the next pandemic or other diseases.
What Did They Do?
The team relied on a big data analysis technique called a genome-wide association study, or GWAS. It’s a highly popular way to analyze genetics and hunt down parts of our DNA chains that are associated with a particular disease.
To explain a bit more: a gene can have single shifts in its letters, without wrecking its usual role. However, a DNA letter swap may have consequences, such as changing the color of your eyes, your blood type, your likelihood of getting Alzheimer’s or autism, or how well you can fend off a viral infection. These subtle changes to a gene, called SNPs (single nucleotide polymorphisms), are what 23andMe scans for when it analyzes your DNA kit.
This is where GWAS comes in. It crunches data to see if DNA letter changes are associated with a disease. With enough data, GWAS can fish out sections of our genetic material that make us vulnerable to an infection. It’s an extremely powerful technique, but far from perfect. Similar to machine learning, GWAS is data-hungry: it needs large datasets, preferably from people of multiple ethnicities and genetic backgrounds.
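To make the idea concrete, here is a minimal sketch of the kind of per-SNP test a GWAS repeats across the genome: a chi-square test on a 2x2 table of allele counts in cases versus controls. It is not 23andMe's actual pipeline, which the article does not describe. The SNP names and allele counts are made up for illustration, and the 5e-8 cutoff is the conventional genome-wide significance threshold rather than a figure from the study.

```python
import math

# Conventional genome-wide significance cutoff (an assumption here,
# not a number reported in the article).
GENOME_WIDE_THRESHOLD = 5e-8


def allelic_test(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi2, p_value, odds_ratio) for the alternate allele.
    """
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    numerator = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
    denominator = (
        (case_alt + case_ref)
        * (ctrl_alt + ctrl_ref)
        * (case_alt + ctrl_alt)
        * (case_ref + ctrl_ref)
    )
    chi2 = numerator / denominator
    # Survival function of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    return chi2, p_value, odds_ratio


def scan(snps, threshold=GENOME_WIDE_THRESHOLD):
    """Run the allelic test on every SNP and flag those below the threshold."""
    results = {}
    for name, counts in snps.items():
        chi2, p_value, odds_ratio = allelic_test(*counts)
        results[name] = {
            "chi2": chi2,
            "p_value": p_value,
            "odds_ratio": odds_ratio,
            "significant": p_value < threshold,
        }
    return results


# Hypothetical counts: (case_alt, case_ref, ctrl_alt, ctrl_ref).
hypothetical_snps = {
    "snp_protective": (3000, 7000, 4000, 6000),  # allele rarer in cases
    "snp_neutral": (5000, 5000, 5000, 5000),     # same frequency in both
}

scan_results = scan(hypothetical_snps)
for name, row in scan_results.items():
    print(name, row)
```

An odds ratio below 1 means the allele is less common among cases, which is the pattern a protective variant would show. Real analyses also adjust for age, sex, and ancestry, typically with regression models, and the strict threshold is there because so many SNPs are tested at once.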
23andMe certainly has a massive database. In April 2020, as Covid-19 began sweeping the country, the 23andMe team followed the geographical flow of the virus to geo-target customers and recruit them into their Covid-19 study. The format was similar to the company’s previous research efforts, mainly with online questionnaires, but this time tailored to the pandemic.
The questions ranged from people’s demographics (their age, socioeconomic status, and ethnicity) to specific questions about Covid-19: for example, whether they were diagnosed, whether they were hospitalized, and how they assessed their breathing and other factors.
Overall, about 1.05 million 23andMe customers responded to the study, with over 15,000 people reporting they had been diagnosed with Covid-19 and 1,100 people hospitalized with the disease. The latter group was specifically added to the study so that the data wouldn’t skew too much towards people who were able to answer the questionnaire, that is, those with relatively mild infections. This ensures the study covers the whole spectrum of Covid-19 responses.
A Blood Type Link
Armed with millions of data points, the team asked if gene variants could change how easily a person gets infected with Covid-19 and how severely the disease progresses.
One especially strong link popped out: the gene that determines a person’s blood type. A person’s blood type is determined by variations in a single gene, ABO, and the analysis showed that this gene was strongly linked to the likelihood that someone would test negative for Covid-19. The team found that people with the O blood type were less likely to test positive for the infection than expected, which suggests, though does not prove, that the blood type could be more protective against the disease.
It sounds like pseudoscience, but genes have a major impact on a person’s susceptibility to diseases, including the “big bads” like HIV-1. For example, one specific genetic type makes a person naturally immune to HIV-1 and AIDS. It’s reasonable to hypothesize that certain genetic types could be more or less susceptible to a new virus, too.
23andMe’s finding confirmed earlier work that suggested a blood type vulnerability to the disease. Back in June 2020, Dr. Tom Hemming Karlsen at Oslo University Hospital also found a protective effect of O blood types, but with a much smaller sample, and only with people from Italy and Spain. “They clarify further what our data could only vaguely hint at,” he said.
The study also identified a cluster of genes on chromosome 3 that correlates with a person’s Covid-19 severity, that is, whether they are likely to end up in a hospital and experience breathing problems. Several of the genes in this location are related to the immune system, and “work is underway to identify what, specifically, the mechanism of action is,” said study author Dr. Janie Shelton.
What especially stands out with the new study is its diversity. GWAS are often performed on people of European ancestry, and it’s still up in the air whether those results can be extrapolated to other populations, such as those of African or Asian descent. While the new study doesn’t yet reflect the diversity of the US, it has a “substantial” number of people who identified as Black (nearly 3 percent) or Latino (about 11 percent). For context, representation of these two groups among the general US population is about 13 percent and 18 percent, respectively.
This allowed the team to look at similarities between groups of people with different ancestries. Blood type popped up in every ethnic group, hinting at an especially powerful genetic link regardless of background that scientists still need to understand. The team also looked at non-genetic risk factors for Covid-19, including gender, socioeconomic status, and obesity.
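As a quick back-of-the-envelope check on those figures, the short sketch below divides each group's share of the study by its share of the US population. It treats “nearly 3 percent” as 3 percent and uses the rounded percentages quoted above, so the outputs are rough approximations.

```python
# Rounded shares quoted in the article, as fractions.
# "Nearly 3 percent" is approximated as 0.03 (an assumption).
study_share = {"Black": 0.03, "Latino": 0.11}
us_population_share = {"Black": 0.13, "Latino": 0.18}

# A ratio of 1.0 would mean the study matches the US population;
# values below 1.0 mean the group is underrepresented.
representation_ratio = {
    group: study_share[group] / us_population_share[group]
    for group in study_share
}

for group, ratio in representation_ratio.items():
    print(f"{group}: {ratio:.2f} of proportional representation")
```

By this rough measure, Black participants appear at a bit under a quarter of their population share and Latino participants at roughly three fifths, which matches the article's point that the cohort is more diverse than usual but not yet representative.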
A New GWAS Future
This isn’t the first time 23andMe has made waves in research. Using a similar model of online surveys, the company has ventured into research on Parkinson’s disease, sleep, and breast cancer, to name a few. Consumer genetic companies are also partnering with academic institutions to share their datasets and methods to mine for new insights into how genetic differences contribute to health and disease.
“I applaud the consumer groups that are turning some of their resources to working on this, and everything will be valuable,” Dr. Robert Green, a medical geneticist at Harvard and Brigham and Women’s Hospital, told STAT in an earlier interview.
In the “data is power” age, consumer genetic companies are paving a new research road. But while their massive and more diverse datasets bolster GWAS, the next question remains the same. Why? Why do O blood types seem more resilient against Covid-19? It’s not just academic curiosity—for future pandemics or other disorders, answering the “why” and “how” is the next step in identifying a vulnerable population.
“We’d have to find out why [the gene variances are] significant—is it significant because it’s affecting blood clotting?” asked Dr. Jennifer Lighter at NYU Langone in reference to an earlier version of the paper. “Unless we find out why there’s a difference, we wouldn’t target therapies or [adjust] a risk category.”
For now, 23andMe is catching up to the pandemic. The blood type link to Covid-19 susceptibility is intriguing, but doesn’t change treatments. As its database grows further, however, we could see a new GWAS landscape, powered by millions of data points, that hunts down culpable genes at breakneck speed and informs treatment decisions to save lives.