I just saw a story about the NSA gearing up a datacenter to potentially hold a yottabyte of surveillance data. The whole surveillance angle itself is pretty interesting, but what caught my attention was the concept of the yottabyte. The yottabyte is 10^24 bytes. That is three levels above the petabyte, which itself is a million gigabytes. If that doesn’t make much sense to you, here is a chart from Wikipedia that might help:

[Chart: yottabyte]

As little as 10 years ago the petabyte seemed just as large and amazing as the yottabyte sounds today. Now it is common for companies such as Google or Facebook to hold several petabytes of information in just one of their many datacenters across the world.

In one sense the yottabyte is no big deal. I mean, it’s just another rung on the ladder of names we give to the ever-increasing amounts of data around us, each rung growing by a factor of 1,000 starting with the kilobyte. Ten years from now the yottabyte will go the way of the petabyte as the term that once represented enormous amounts of information but now is just another drop in the bucket. It will be replaced by a new term that represents 1 million or even 1 billion yottabytes. Nevertheless, it is fun and also illustrative to spend a moment to ponder the yottabyte, which is indeed an enormous number by today’s standards. And let us not ignore the fact that yottabyte, when pronounced with a long “oh”, sounds like “yoda byte” – now that’s awesome!
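For anyone who likes to see the ladder spelled out, here is a minimal Python sketch of it. It assumes the decimal convention used in this post, where every rung is 1,000 times the previous one; the names `PREFIXES` and `bytes_in` are made up for this illustration.

```python
# Decimal byte units, each 1,000 times larger than the one before it.
PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]


def bytes_in(prefix: str) -> int:
    """Return the number of bytes in one <prefix>byte, using powers of 1,000."""
    rung = PREFIXES.index(prefix) + 1  # the kilobyte is rung 1
    return 1000 ** rung


# Print the whole ladder as powers of ten.
for rung, prefix in enumerate(PREFIXES, start=1):
    print(f"1 {prefix}byte = 10^{3 * rung} bytes")

# How many rungs separate the petabyte from the yottabyte?
print(PREFIXES.index("yotta") - PREFIXES.index("peta"))
```

The last line counts the rungs between the petabyte and the yottabyte, which is where the "three levels above" figure comes from.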

[Image: Attack of the bits!]

Let’s first build up to the yottabyte in today’s standards. A one-page Microsoft Word document is anywhere from 50 to 100 kilobytes (KB). A picture from your camera is typically anywhere from 0.5 to 3 megabytes (MB), and a song you might download from iTunes is usually about 3 or 4 megabytes. Moving up the ladder, popular consumer devices such as iPhones, iPods, and digital cameras hold anywhere from 1 to 100 gigabytes (GB) of storage capacity. The top-of-the-line hard drives that you can buy at Best Buy are now 2 terabytes (TB). Assuming Google has 1 million servers, each with 1 terabyte of storage (a size Google has already reached or likely will reach in the next year), we can estimate that Google has leapfrogged the petabyte and now boasts a total worldwide storage capacity of roughly 1 exabyte. The combined storage capacity of the approximately 1 billion personal computers worldwide in 2009, at an average of perhaps 250 gigabytes each, is not even half a zettabyte!
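The two back-of-the-envelope totals in this paragraph can be checked with a few lines of Python. This is only a sketch: the server count, PC count, and per-machine capacities are the rough guesses from the text, not measured figures, and the variable names are invented for the example.

```python
# Decimal byte units.
GB = 10**9
TB = 10**12
EB = 10**18
ZB = 10**21

# Assumed inputs, taken straight from the rough estimates in the text.
google_servers = 1_000_000
storage_per_server = 1 * TB
pcs_worldwide = 1_000_000_000
storage_per_pc = 250 * GB

google_total = google_servers * storage_per_server
pc_total = pcs_worldwide * storage_per_pc

print(f"Google estimate: {google_total / EB} exabytes")
print(f"All PCs in 2009: {pc_total / ZB} zettabytes")
```

Under these assumptions Google lands at exactly 1 exabyte and the world's PCs at a quarter of a zettabyte.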

A yottabyte would equal about 1 million times the current storage capacity of Google, or about 4,000 times the storage capacity of every personal computer in the world in 2009 (10^24 bytes divided by roughly 2.5 × 10^20 bytes). Here are some other fun stats I have gathered from across the web, with sources:

  • In 2006, the amount of digital information created, captured, and replicated [in the world] was 1,288 × 10^18 bits. In computer parlance, that’s 161 exabytes or 161 billion gigabytes. This is about 3 million times the information in all the books ever written. (source)
  • Between 2006 and 2010, the information added annually to the digital universe will increase more than sixfold from 161 exabytes to 988 exabytes. (source)
  • All hard disk capacity developed in 1995 – 20 Petabytes (source)
  • All printed material in the world – 200 Petabytes (source)
  • All words ever spoken by human beings – 500 Petabytes (source)
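The unit conversions in the first two stats are easy to verify. Here is a small Python sketch, assuming 8 bits per byte and decimal units; the variable names are made up for the illustration and the input figures are the ones quoted above.

```python
BITS_PER_BYTE = 8
GB = 10**9
EB = 10**18

# Figure quoted for 2006: 1,288 x 10^18 bits.
bits_2006 = 1288 * 10**18
bytes_2006 = bits_2006 // BITS_PER_BYTE

print(f"{bytes_2006 / EB} exabytes")
print(f"{bytes_2006 / GB / 10**9} billion gigabytes")

# Projected growth from 161 exabytes (2006) to 988 exabytes (2010).
growth = 988 / 161
print(f"growth factor: {growth:.2f}")
```

Dividing the bit count by 8 gives 161 × 10^18 bytes, which matches the 161 exabytes quoted, and 988 / 161 comes out a little over 6, consistent with "more than sixfold".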

We must keep in mind, of course, that total storage capacity does not equal the total amount of information. Google might have an exabyte of storage capacity, but perhaps only 70% or less of this capacity is even being used. Of the capacity that is used, a majority is simply used to replicate data, perhaps 3 or even 6 times over, as backup in case of data loss. This implies that as little as 10%, or 100 petabytes’ worth, of Google’s exabyte capacity is being used for real, unique information. All of these numbers are rough estimates, but you get the idea.
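To see where a figure like 10% comes from, here is a rough Python sketch of the estimate. The 70% utilization and the 3 to 6 copies are the guesses from the paragraph above, and `unique_data` is a hypothetical helper written just for this example.

```python
PB = 10**15
EB = 10**18


def unique_data(capacity: float, used_fraction: float, copies: int) -> float:
    """Estimate unique bytes stored, given how full the disks are and
    how many copies of each piece of data are kept."""
    return capacity * used_fraction / copies


# Assumed: 1 exabyte of capacity, 70% in use, each item stored 3 or 6 times.
low = unique_data(1 * EB, 0.70, copies=6)
high = unique_data(1 * EB, 0.70, copies=3)

print(f"unique data: {low / PB:.0f} to {high / PB:.0f} petabytes")
```

With six copies the estimate comes to a bit under 120 petabytes, so a utilization somewhat below 70% brings it down to the 100 petabytes mentioned above.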

The exponentially increasing amount of information being created and stored around the world is one of the central phenomena upon which our technological progress and future is based. And no matter how you look at it, the total amount of information being stored and processed across the world is fascinating. Yet in spite of the impressive amount of information being digitally created and stored each day by the human race, this is still nothing compared to the total information content of the world and the universe at large that could be recorded. Today a large portion of the digital information that is being created comes from individuals, corporations, and governments that are taking photos and videos. As lifelogging and 24/7 surveillance become ever-present in our lives, the number of photos and videos being recorded and archived is set to absolutely explode. Imagine the amount of data that will be collected over the next decade or two when every single individual on the planet is equipped with a device that records every minute of their lives. Indeed, when that day arrives the yottabyte will be just a small drop in an enormous sea of life streams, experiences, and knowledge of mankind encoded into bits.

Singularity Hub chronicles technological progress by highlighting the breakthroughs and issues shaping the future as well as supporting a global community of smart, passionate, action-oriented people who want to change the world.