The sequencing of the human genome was one of the greatest scientific feats of the past century, but it’s a little-known fact that it’s still a work in progress with considerable gaps. New research suggests we could be just months away from finally finishing the job.
Nearly two decades after the Human Genome Project released the first map of our DNA, there are still large sections that are a mystery to us. Scientists have been slowly filling in the gaps, but certain portions that feature repetitive sequences going on for millions of base pairs have long been seen as intractable.
That’s because the most common DNA sequencing technologies read the genome in short snippets that then have to be stitched back together. In these highly repetitive sections the snippets are almost impossible to tell apart, so putting them back together in the right order is extremely difficult.
“Imagine having to reconstruct a jigsaw puzzle,” senior author Adam Phillippy, from the National Human Genome Research Institute (NHGRI), said in a press release. “If you are working with smaller pieces, each contains less context for figuring out where it came from, especially in parts of the puzzle without any unique clues, like a blue sky. The same is true for sequencing the human genome. Until now, the pieces were too small, and there was no way to put the hardest parts of the genome puzzle together.”
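To make the jigsaw analogy concrete, here is a minimal Python sketch. It is a toy illustration, not the researchers' pipeline: the sequences, the lengths, and the helper `ambiguous_fraction` are all invented for this example. It builds a tiny "genome" with a repetitive stretch in the middle and asks what fraction of reads of a given length could have come from more than one place.

```python
from collections import Counter


def ambiguous_fraction(genome: str, read_len: int) -> float:
    """Fraction of reads of length read_len that also occur elsewhere in genome."""
    reads = [genome[i:i + read_len] for i in range(len(genome) - read_len + 1)]
    counts = Counter(reads)
    ambiguous = sum(1 for read in reads if counts[read] > 1)
    return ambiguous / len(reads)


# Hypothetical toy genome: unique flanks around a 7-base unit repeated 10 times.
left_flank = "ACGTTGCAAGGCTA"
right_flank = "TTAGGCCATGCAAT"
repeat_unit = "GATTACA"
genome = left_flank + repeat_unit * 10 + right_flank

# Short reads fit entirely inside the repeat, so most of them match many places.
short_frac = ambiguous_fraction(genome, read_len=8)

# Reads longer than the whole repeat always reach into a unique flank.
long_frac = ambiguous_fraction(genome, read_len=80)

print(f"ambiguous short reads: {short_frac:.0%}")
print(f"ambiguous long reads:  {long_frac:.0%}")
```

In this toy case most of the short reads are interchangeable, while every read long enough to span the repeat has exactly one possible home. Real repeats run to millions of bases, which is why read length matters so much.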
But that is now changing. In a paper published last week in Nature, the researchers describe how they used a cutting-edge approach known as nanopore sequencing to tackle some of the previously inscrutable sections of the genome and produce the first-ever gapless sequence of the X chromosome.
Nanopore sequencing works by threading DNA molecules through a tiny hole and measuring changes in an electric current flowing through the hole to work out the sequence of bases in the molecules. Unlike previous approaches, it’s able to read ultra-long stretches of DNA in one go.
That was enough to fill many of the gaps in the genome, but the centromere, a region that encompasses roughly 3.1 million base pairs of highly repetitive sequences, still presented a problem. Fortunately, the team was able to find idiosyncratic sequences within it that could act as markers, letting them link together multiple long reads to span the entire centromere.
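The idea of using rare variants as signposts can be sketched in a few lines of Python. This is a simplified illustration under invented assumptions, not the method from the paper: the repeat unit, the two variant copies, and the helpers `unique_kmers` and `place_read` are hypothetical. It looks for short sequences that occur exactly once in a repeat array, then uses them to pin reads to a position.

```python
from collections import Counter
from typing import Optional


def unique_kmers(seq: str, k: int) -> dict[str, int]:
    """Map every k-base substring that occurs exactly once in seq to its position."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    return {kmer: i for i, kmer in enumerate(kmers) if counts[kmer] == 1}


def place_read(read: str, markers: dict[str, int], k: int) -> Optional[int]:
    """Return the read's start position in the array, or None if it has no marker."""
    for offset in range(len(read) - k + 1):
        pos = markers.get(read[offset:offset + k])
        if pos is not None:
            return pos - offset
    return None


# Hypothetical repeat array: identical units, except two copies with one changed base.
unit = "GATTACA"
variant_a = "GATTGCA"
variant_b = "GACTACA"
array = unit * 4 + variant_a + unit * 4 + variant_b + unit * 4

markers = unique_kmers(array, k=7)

read_with_marker = array[20:45]   # overlaps variant_a
read_without = array[35:56]       # lies entirely in identical copies
read_linking = array[50:80]       # overlaps variant_b

print(place_read(read_with_marker, markers, 7))
print(place_read(read_without, markers, 7))
print(place_read(read_linking, markers, 7))
```

Reads that contain a marker can be anchored to a single position, and anchored reads that overlap one another can then be chained across the array. A read with no marker stays unplaceable, which is why reads long enough to reach at least one marker are needed.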
To ensure that the X chromosome sequence was as accurate as possible, the researchers combined nanopore sequencing with results from two other gold-standard sequencing technologies and from approaches for mapping the genome. The result is more than 99.9 percent accurate, which is considered enough to call the sequence finished.
Filling the gaps in the human genome could prove invaluable for biomedical research. “We’re starting to find that some of these regions where there were gaps in the reference sequence are actually among the richest for variation in human populations, so we’ve been missing a lot of information that could be important to understanding human biology and disease,” lead author Karen Miga, from the UC Santa Cruz Genomics Institute, said in a press release.
And while the complete X chromosome is the pièce de résistance of this paper, the team reports that they have applied their approach to the full genome and have already managed to reconstruct several other chromosomes. They are aiming to produce a complete human genome by the end of this year.
What secrets lie in store in these hidden parts of the human genome remains to be seen, but uncovering them will be a major step towards humankind truly mastering our own biology.