
Costs of DNA Sequencing Falling Fast – Look At These Graphs!

[graph: Cost per Genome]

The price of genomes is falling! This is the first step towards a genetically enabled future.

You may know that the cost to sequence a human genome is dropping, but you probably have no idea how fast that price is coming down. The National Human Genome Research Institute, part of the US National Institutes of Health, has compiled extensive data on the costs of sequencing DNA over the past decade and used that information to create two truly jaw-dropping graphs. NHGRI’s research shows that not only are sequencing costs plummeting, they are outstripping the exponential curves of Moore’s Law. By a big margin. You have to see this information to really understand the changes that have occurred. Check out the original NHGRI graphs below. With costs falling so quickly, we will soon be able to afford to produce a monumental flood of DNA data. The question is, will we know what to do with it once it arrives?

The Cost per Megabase (million base pairs) graph reflects the production costs of generating raw, unassembled sequence data. For the Cost per Genome graph, NHGRI considered a 3000 Mb genome (i.e., a human genome) with the level of redundancy necessary to assemble the long strand in its entirety. Both graphs show some amazing drop-offs.

[graph: Cost per Megabase]

Keep in mind that these graphs use a logarithmic scale on the Y-axis, so the steady straight-line decline at the beginning of each graph represents exponential change in the field: costs fall by a constant factor each year. From 2001 to 2007, first generation techniques (dideoxy chain termination or ‘Sanger’ sequencing) for sequencing DNA were already following an exponential curve. Starting around January of 2008, however, things go nuts. That’s the time when, according to the NHGRI, a significant part of production switched to second generation techniques. Look how quickly costs plummet. We’ve discussed before how retail costs for genome sequencing are dropping thanks to efforts from companies like Complete Genomics and Illumina. What these graphs show is that those retail prices reflect more than the genius of these companies – it’s a general crash of prices in the industry as a whole. Things are going to continue in this fashion. We’ve already seen newer sequencing technologies start to emerge, and there are institutions all over the world that are dedicated to pursuing genetic information in all its forms.
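To make the log-scale point concrete, here is a minimal Python sketch of how a halving time can be read from two points on such a curve, and how that compares with a Moore’s Law pace. The dollar figures are rough illustrative values, not NHGRI data points; the two-year halving period is one common reading of Moore’s Law, assumed here; and the function names are made up for this example.

```python
import math


def halving_time(cost_start, cost_end, years):
    """Years per halving, assuming a constant exponential decline
    (a straight line on a logarithmic axis)."""
    halvings = math.log2(cost_start / cost_end)
    return years / halvings


def moore_projection(cost_start, years, halving_years=2.0):
    """Cost after `years` if it merely halved every `halving_years`.
    The 2-year default is an assumption, not a figure from NHGRI."""
    return cost_start * 0.5 ** (years / halving_years)


# Illustrative values only: roughly $100M falling to roughly $30k in ten years.
start_cost = 100_000_000.0
end_cost = 30_000.0
span_years = 10.0

print(f"Observed halving time: {halving_time(start_cost, end_cost, span_years):.2f} years")
print(f"Moore's Law pace would give: ${moore_projection(start_cost, span_years):,.0f}")
```

With these illustrative inputs the cost halves in well under a year, while a two-year halving pace would leave the cost in the millions of dollars after the same decade. That gap is what the graphs show as the curve dropping away from the Moore’s Law line.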

Just so you know, there’s no sleight of hand here. NHGRI’s explanation of their cost calculations reveals they did a very thorough job, especially when parsing what should be included in production costs and what shouldn’t. You can see all the details on their site.

**UPDATE 3.6.11 To answer questions in the comments section, here are the values used to calculate genome costs:
“The following ‘sequence coverage’ values were used in calculating the cost per genome:
Sanger-based sequencing (average read length=500-600 bases): 6-fold coverage
454 sequencing (average read length=300-400 bases): 10-fold coverage
Illumina and SOLiD sequencing (average read length=50-100 bases): 30-fold coverage”

We have to keep in mind for these calculations that “…the ‘Cost per Genome’ graph was generated using the same underlying data as that used to generate the ‘Cost per Megabase of DNA Sequence’ graph; the former thus reflects an estimate of the cost of sequencing a human-sized genome rather than the actual costs for specific genome-sequencing projects.” (Emphasis mine). We know that companies like Complete Genomics are offering genomes for significantly less than $30k (the estimate in the graph). Yet the general cost for sequencing ‘a genome’ is still averaging around $30k according to the NHGRI estimates. Companies that can beat this price in retail are ‘ahead of the curve’ so to speak (which is part of the reason we like them).**
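The relationship between the two graphs can be sketched in a few lines of Python. This is a simplified illustration, not NHGRI’s actual method: it assumes cost per genome is just cost per megabase times genome size times coverage. The coverage values and the 3000 Mb genome size come from the text above; the per-megabase price in the example is hypothetical, and the function and dictionary names are invented for this sketch.

```python
GENOME_SIZE_MB = 3000  # human-sized genome, as used in the article

# Fold coverage per platform, as quoted from NHGRI above.
COVERAGE = {
    "sanger": 6,
    "454": 10,
    "illumina_solid": 30,
}


def megabases_required(platform, genome_mb=GENOME_SIZE_MB):
    """Raw megabases that must be sequenced for one genome."""
    return genome_mb * COVERAGE[platform]


def cost_per_genome(cost_per_mb, platform, genome_mb=GENOME_SIZE_MB):
    """Estimated genome cost from a per-megabase cost (simplified model)."""
    return cost_per_mb * megabases_required(platform, genome_mb)


# Hypothetical per-megabase price, chosen only to show the arithmetic.
example_cost_per_mb = 0.50
for platform in COVERAGE:
    mb = megabases_required(platform)
    cost = cost_per_genome(example_cost_per_mb, platform)
    print(f"{platform}: {mb:,} Mb sequenced, about ${cost:,.0f} per genome")
```

Because coverage differs by platform, the same per-megabase price gives very different genome prices, so the two graphs are not separated by a simple factor of 3000.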

One of the costs that NHGRI (rightfully) doesn’t include in production is the massive research investment the industry needs to make any of this DNA data worthwhile. With the falling costs of sequencing we will have enormous amounts of raw data, but still very little understanding of what it means. As Daniel MacArthur (lately of Genetic Futures, now of Wired) points out in his discussion, production is outstripping research. What does the 897324989th base pair of your genome do? If you don’t know, why do you care if it reads A, C, T, or G? As we’ve seen with retail sequencing of single nucleotide polymorphisms (SNPs), giving a customer a look at parts of their DNA can be fun, but it’s not particularly enlightening.

Yet.

With falling genome prices we should be able to perform ever larger studies to correlate genes with medical histories. Already we’ve seen insurance companies and universities assemble large stores of genetic information in preparation for the days when such research will be financially possible. The stage is set for a great revolution in genetics fueled by plummeting sequencing prices. One day soon we should have an understanding of our genomes such that getting everyone sequenced will make medical sense. But that day hasn’t arrived yet. As we look at these amazing graphs we should keep in mind that falling prices are simply the first step in generating the future of medicine that genetics has promised us. The best science is still ahead of us. Get ready for it.

[image credits: NHGRI at NIH]
[citation: Wetterstrand KA. DNA Sequencing Costs: Data from the NHGRI Large-Scale Genome Sequencing Program Available at: http://www.genome.gov/sequencingcosts/. Accessed March 1, 2011]


19 comments

  • Mammago says:

    Whoa, whoa, whoa – I was under the impression that Complete Genomics had brought their prices down to a mere $5,000 ages ago –

    http://singularityhub.com/2009/08/28/get-your-entire-genome-from-complete-genomics-for-5000/

    What’s the deal with these graphs then?

    • Adsaenz says:

      See the update above. Basically, NHGRI was trying to estimate the general cost to sequence ‘a genome’ not the actual reported costs of individual companies. Complete Genomics is ahead of the curve.

  • Daen de Leon says:

    For all the breathless hype about breaking Moore’s Law, we would be well advised to re-read Gordon Moore’s own statement on his eponymous ‘Law’:

    Moore’s law has been the name given to everything that changes exponentially. I say, if Gore invented the Internet, I invented the exponential.

    That said, the cost reductions since ’08 have been driven as much by the scaling of production of reagents and consumables, and the increased sensitivity of the instruments – we’re getting to single molecule scales, as demonstrated (almost) reliably by Pacific Biosciences recently. Skipping the expensive amplification steps, and making the runs parallel and more efficient, has been the driver, as duly noted.

    I’m not at all convinced that sequencing every patient’s genome is going to do much for healthcare in the next ten years. Cancer and autoimmune disorders are not static diseases – they derive from a complex mix of differential expression of proteins, localized epigenetic changes and environmental factors. Only one of these (differential protein expression) may be readily discernible from whole genome sequencing, and that is far from established. Instead, I believe that tissue-specific gene and epigenetic expression profiling tests and their detailed analysis will be more important for patient care in future: a combination of molecular cytology, systems biology and bioinformatics/biostatistics.

  • Jason says:

    *** Keep in mind that these graphs use a logarithmic scale on the Y-axis so that steady decline in the beginning of each graph represents ***decelerating*** change in the field. ***

    Each unit on the left represents a smaller amount the further away from the top you are, so the fall in costs of a steady declining line are SLOWING not speeding up.

  • ChristmasPaul says:

    Uhhh, yeah, according to that graph, the current price of a complete human genome sequencing is 10 ^ 4.5, which is around $31,600. I also thought it cost less than this by now. I know it was a tenth of a billion dollars in the beginning but I really thought it was less than $31K by now. What’s going on?

  • L.A. Louizos says:

    Whole genome sequencing is the first step to accumulate raw data. We then need the whole genomics and proteomics computational infrastructure to complete the whole pattern. After that we will transcend medicine to a whole new level: that of prevention and personalized medicine.

  • Mike says:

    One of the consequences of this that you hear talked about is that computational costs are becoming a larger portion of DNA sequencing and analysis. Storing the data, piecing it together, analyzing it, etc. requires some pretty good computer hardware, and since Moore’s law isn’t keeping up with DNA sequencing, that cost is increasing, comparatively.

  • Jamesjoyce says:

    What happened in October 2007?

  • Homer500 says:

    Am I missing something? The two curves should differ by a ratio of 3000 (the number of megabase pairs in a human genome), but the ratio appears to be 30,000. What gives? There’s a factor of 10 creeping in somehow.

    • Adsaenz says:

      Remember that while a genome is roughly 3000 megabases, sequencing a genome requires piecing small segments of sequenced DNA together. To that end, genome sequencers use multiple passes on the bits of DNA. This multiple ‘coverage’ means that each base is read several times to ensure accuracy. 10x coverage is common (though standards vary, and preferences are probably towards higher coverage). So, for a 3000 megabase genome, 30,000 megabases of sequencing is completely reasonable.

  • net says:

    I don’t understand how Moore’s law is relevant here. I believe that technology does have a big part in the reduction of the cost, but Moore’s law only applies to the hardware. What about the algorithms?

    http://www.i-programmer.info/news/112-theory/1763-algorithms-beat-moores-law.html

    It would be crazy to believe that technology is the only reason for the price decrease. I agree 100% with Daen de Leon on the point of economies of scale. Increasing production or expanding can dramatically lower the overall cost.

    http://en.wikipedia.org/wiki/Economies_of_scale

  • Rdinsmore says:

    Why put Moore’s law here? DNA sequencing just began, whereas Moore’s law came decades after the invention of the transistor. Another difference is that Moore’s law describes fabrication, not analytical chemistry.

    • Jan Doeschot says:

      Moore’s law is relevant here because increases in computing power go hand in hand with increases in sequencing speed. This graph however shows that this is not the only thing happening but new ways of sequencing are having an effect that allows sequencing to drastically exceed the doubling time of computer technology.

  • Cellar says:

    My reading of the graph suggests that the next big drop would be, oh, around 2017. ICBW, but it doesn’t seem unreasonable to suggest it takes until then for complete sequencing to cost but a few dollars. HOPEFULLY by then we’ll have sorted out the privacy handling. Right now it’s fine to open up your genome to all and sundry if you’re reasonably healthy and not averse to a bit of fame, but in a while there might be serious repercussions if you’re not spotlessly healthy or without a criminal record. And sorting the privacy thing is important not only because our current approach is laughably inept (“please sir, don’t abuse all that data you’re sitting on because I can’t check if you would”) but also because the stakes riding on it go well beyond your personal wellbeing. It’s about not making ourselves slaves to our technology as the bureaucrats too learn to use technology, say to enforce policy. Unless we ensure it will be, it really isn’t friede freude eierkuchen that’s waiting for us in the shining bright future we’re building for ourselves and our children.

    Yes, it really is time to push forward with the zero-knowledge proofs and such, learning how to use them everywhere where previously we’d hand over far too much data just to prove we have it–as is increasingly being demanded by governments the world over, of literally all types.

  • tj says:

    Yup. You haven’t covered the cost of data storage though. Next gen. sequencing generates huge amounts of data that need to be stored and processed before any sequence related meaning can be derived from it. You really should do an article on that.

  • Anonymous says:

    Note the vertical axis is incorrectly labelled: The decade below $1 should be $0.1, not $0. The latter never appears on a logarithmic axis.
