A new type of software has been shown to predict revolutions by mining news reports from around the world. Retrospectively mining the news of the past 30 years, the software indicates points at which the likelihood of a revolution is high. When put to the test – bingo! – the software showed spikes just before the recent Egyptian and Libyan upheavals. It was also able to sift through world news to retrospectively pinpoint Osama Bin Ladin’s location to within 200 km. In the emerging science of ‘culturomics,’ which tracks cultural trends through the written word, the software was the first to demonstrate that news coverage can be used to predict future events.
The software sifted through news reports from nearly every country in the world. Major sources included global news databases such as the US government-run Open Source Center, which provides foreign open source intelligence, and its British equivalent, BBC Monitoring, as well as the New York Times’ archive that dates back to 1945. In all, the body of data included over 100 million news articles. The story elements were woven together into a mind-boggling web of 100 trillion relationships. To crunch the massive amount of data, the SGI Altix supercomputer Nautilus was enlisted. Its 1024 Intel Nehalem cores give it a total processing power of 8.2 teraflops (trillions of floating point operations per second).
The strategy for pulling out relevant information included two main techniques. ‘Sentiment mining’ involves counting the number of words in a document that are categorized as positive, such as “good” or “nice,” or negative, such as “terrible” or “horrific.” Changes in the tone of a region’s news documents over time correlate with the sentiment of the people in that region. A major dip in tone, as when the frequency of negative terms rapidly increases, means the natives are getting restless. An unprecedented dip could spell revolution. The second technique, called ‘full-text geocoding,’ matches sentiment to geographic locations.
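To make the word-counting idea concrete, here is a minimal sketch in Python. It is not Leetaru’s implementation: the two word lists hold only the four example words quoted above, and the `tone` function and its percentage scale are assumptions made for illustration.

```python
import re

# Toy word lists holding only the examples quoted in the article;
# the lexicons actually used by the software are not given here.
POSITIVE = {"good", "nice"}
NEGATIVE = {"terrible", "horrific"}


def tone(text: str) -> float:
    """Return (positive words - negative words) as a percentage of all words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    return 100.0 * (positive - negative) / len(words)


if __name__ == "__main__":
    print(tone("The harvest was good and the weather nice"))
    print(tone("A terrible and horrific day"))
```

Averaging such scores over all of a region’s articles in a given month would give one point on a tone curve; a falling curve is the kind of dip described above.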
The software was developed by Kalev Leetaru at the University of Illinois’ Institute for Computing in the Humanities, Arts and Social Science. Looking specifically at news about Egypt, Libya and Tunisia, Leetaru’s software showed compelling trends of negative tone in the decade leading up to the recent revolutions we’ve come to call the Arab Spring.
In Egypt, for instance, widespread protests that began on January 25, 2011 led to President Mubarak’s resignation on February 11. Tone monitoring was performed on 52,438 articles worldwide between January 1979 and March 2011 that contained any mention of an Egyptian city. The software selected for Egyptian cities rather than the word “Egypt” to filter out articles that only casually mentioned Egypt the way a travel guide might do. Between January 1 and January 24 of 2011, global tone about Egypt dropped to an extent that had only been seen twice in the past 30 years. The previous times were January 1991, when US planes bombarded Iraqi troops in Kuwait, and the March 2003 US invasion of Iraq.
According to Leetaru, the constant speculation of pundits in the days before the uprising was simply wasted breath. “Despite being hailed as a social media revolution,” he writes in the report, “monitoring the tone of only mainstream media around the world would have been enough to suggest the potential for unrest in Egypt.” He doesn’t assert that a threshold for global tone exists beyond which revolution becomes inevitable. Nor is the software powerful enough to predict the progression or timing of events. Rather, tone tracking will reveal periods of “increased potential for unrest” that policy-makers could use to cut through the speculation. It’s an assessment of trends over long periods of time, which are harder to spot through the subjective and likely biased monitoring done by governments and think tanks.
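One way to read “a drop that had only been seen twice before” is as a count of precedents: for a given month, how many earlier months had a tone at least as low? The Python sketch below does exactly that. The function name and the monthly values are invented for illustration; they are not taken from the report, and the report may measure the drop differently.

```python
def precedents(tones: list[float], index: int) -> int:
    """Count earlier months whose tone was at or below the tone at `index`."""
    current = tones[index]
    return sum(earlier <= current for earlier in tones[:index])


if __name__ == "__main__":
    # Made-up monthly tone values in chronological order (not real data).
    monthly_tone = [0.5, 0.4, -2.0, 0.3, 0.6, -2.5, 0.2, 0.1, -1.9]
    latest = len(monthly_tone) - 1
    print(precedents(monthly_tone, latest))
```

A low count over a long history marks a month as unusual. In keeping with Leetaru’s caution, that signals increased potential for unrest, not a threshold beyond which revolution is certain.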
“The mere fact that the US President stood in support of Mubarak suggests very strongly that even the highest level analysis suggested that Mubarak was going to stay there,” Leetaru told BBC News. “That is likely because you have these area experts who have been studying Egypt for 30 years, and in 30 years nothing has happened to Mubarak. If you look at this tonal curve it would tell you the world is darkening so fast and so strongly against him that it doesn’t seem possible he could survive.”
Tonal analyses pointed to similar darkenings for the leaders of Tunisia and Libya, and for the ethnic conflicts in the Balkans during the 1990s, which Leetaru points out “caught many by surprise.”
Global news can also give us clues to locate individuals of interest. The software analyzed documents that contained the words “bin ladin” between 1979 and April 2011, and mapped associated locations in those documents. Many would have put their money on Bin Ladin being in Afghanistan. Tallying up locations associated with the elusive Al Qaeda leader shows that only 28 percent of news reports had him in Afghanistan while about half of them speculated he was in Pakistan. Abbottabad, which was mentioned only once before Bin Ladin’s discovery there, is less than 200 kilometers from the two cities most associated with him. Again, Leetaru acknowledges the limitations of the software. “While far from a definitive lock in Bin Ladin’s location,” he writes in the report, “global news content would have suggested Northern Pakistan in a 200 km radius around Islamabad and Peshawar as his most likely location, and that he was nearly twice as likely to be making his residence in Pakistan as Afghanistan.”
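The tallying step and the 200 km claim can both be sketched in a few lines of Python. Everything here is an assumption made for illustration: the mini-gazetteer, its approximate coordinates, the substring matching, and the sample sentences are hypothetical stand-ins, not the full-text geocoder the software uses.

```python
import math
from collections import Counter

# Hypothetical mini-gazetteer: place -> (country, latitude, longitude).
# Coordinates are approximate and supplied here for illustration only.
GAZETTEER = {
    "islamabad": ("Pakistan", 33.68, 73.05),
    "peshawar": ("Pakistan", 34.02, 71.52),
    "abbottabad": ("Pakistan", 34.17, 73.22),
    "kabul": ("Afghanistan", 34.53, 69.17),
}


def country_shares(documents, keyword):
    """Share of keyword-matching documents that mention a place in each country."""
    counts = Counter()
    matched = 0
    for doc in documents:
        lowered = doc.lower()
        if keyword not in lowered:
            continue
        matched += 1
        # Count each country at most once per document.
        countries = {GAZETTEER[p][0] for p in GAZETTEER if p in lowered}
        counts.update(countries)
    return {c: n / matched for c, n in counts.items()} if matched else {}


def distance_km(place_a, place_b):
    """Great-circle (haversine) distance between two gazetteer places."""
    _, lat1, lon1 = GAZETTEER[place_a]
    _, lat2, lon2 = GAZETTEER[place_b]
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


if __name__ == "__main__":
    # Invented sample sentences, not real news text.
    docs = [
        "Bin Ladin reportedly seen near Peshawar",
        "Officials say bin ladin is hiding outside Kabul",
        "Weather in Islamabad",
    ]
    print(country_shares(docs, "bin ladin"))
    print(round(distance_km("abbottabad", "islamabad")))
    print(round(distance_km("abbottabad", "peshawar")))
```

With these approximate coordinates, both distances from Abbottabad come out well under 200 km, consistent with the radius quoted above.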
The current research extends the young field of ‘culturomics,’ which attempts to quantify human culture “across societies and across centuries.” In collaboration with Google Books, a study released last year analyzed a collection of digitized texts that represented 4 percent of all books ever printed. The researchers quantified cultural trends reflected in the English language across diverse topics such as the “evolution of grammar, collective memory, the adoption of technology [and] the pursuit of fame.” The report acknowledges that this kind of high-throughput, computer-based tone monitoring is not as accurate as when tone is gauged by humans. In other efforts to predict the future through tracking trends, police in Los Angeles are currently experimenting with algorithms that predict crimes before they happen.
It’s one thing, however, to plot global tone and ask retrospectively whether trends pointed toward events that we know have already happened. It remains to be seen whether Leetaru’s software can track events today to predict the events of tomorrow. “It is obviously much easier to find precursory signs when you know where to look than to do it blindly,” Thomas Chadefaux, a political scientist at the Swiss Federal Institute of Technology, said in an interview with the journal Nature. He also called Leetaru’s work “a welcome addition to a field – political science – that has cared very little about finding early warning signals for war, or making predictions at all.”