What’s going to happen tomorrow? What about next week or next year? Recorded Future says they can tell you. No, they don’t claim to be psychics. In fact, they maintain that any one of us could tell the future – we need only the right algorithms.
The strategy behind Recorded Future’s techno-soothsaying is different from that of software that can, for example, pinpoint the location of a future crime with a certain probability. Rather than read the “signs” and make an educated guess, their algorithm sifts through the multitude of online sources of world events – news sites, blogs, social media – and extracts concrete information about who will do what when, and what will happen where.
The information is intended to serve a practical purpose. You’re a busy CEO with a company to run and don’t have the time to sift through 70,000 web sources. Recorded Future will do the legwork for you. And their system can be configured to detect the reported activity of specific companies, fine-tuned with keywords, in the coming days, months, or years.
Their 70,000 sources on the web range from big media and government web sites to individual blogs, and social media such as selected Twitter streams. From those sources they’ve built a database of millions of entities and events and more than 2 billion facts that they think people will want to know.
Similar to Google’s Knowledge Graph, Recorded Future takes unstructured text and puts it in the context of the real world, drawing relationships between otherwise unassociated pieces of text. Their system would link Albert Einstein with Ulm, Germany, his birthplace, and relativity theory, for example. Temporal analysis puts times to events, such as an absolute “September 11, 2001,” or a relative “two weeks from now.” If a news text read, “Barack Obama said yesterday that Hillary Clinton will be traveling to Haiti next week,” Recorded Future treats the statement as two events: a quotation event in the past and a travel event to occur in the future. It then maps those events to actual calendar dates, turning imprecise text into a precise timeline. It even compensates for cultural ambiguities, such as the “first day of the week” being Sunday in the US, but Monday according to the International Organization for Standardization.
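To make the idea of mapping relative phrases to calendar dates concrete, here is a minimal Python sketch. It is not Recorded Future’s code: the function names `week_start` and `resolve`, the three phrases it handles, and the `first_day` parameter are all assumptions made for illustration. It resolves a phrase against an article’s publication date and shows how the Sunday-versus-Monday convention shifts what “next week” means.

```python
from datetime import date, timedelta
from typing import Tuple


def week_start(d: date, first_day: str = "monday") -> date:
    """Return the first day of the week containing d (hypothetical helper)."""
    # date.weekday() numbers the days Monday=0 ... Sunday=6.
    if first_day == "monday":      # ISO convention
        offset = d.weekday()
    elif first_day == "sunday":    # common US convention
        offset = (d.weekday() + 1) % 7
    else:
        raise ValueError("first_day must be 'monday' or 'sunday'")
    return d - timedelta(days=offset)


def resolve(phrase: str, published: date, first_day: str = "monday") -> Tuple[date, date]:
    """Map a relative time phrase to an inclusive (start, end) date range.

    Only three phrases are handled; a real system would need a far
    richer grammar. This is an illustrative assumption, not a real API.
    """
    if phrase == "yesterday":
        day = published - timedelta(days=1)
        return day, day
    if phrase == "two weeks from now":
        day = published + timedelta(weeks=2)
        return day, day
    if phrase == "next week":
        start = week_start(published, first_day) + timedelta(weeks=1)
        return start, start + timedelta(days=6)
    raise ValueError("unsupported phrase: " + phrase)


# An article published on Wednesday, July 4, 2012.
published = date(2012, 7, 4)
quote_day = resolve("yesterday", published)
trip_iso = resolve("next week", published, first_day="monday")
trip_us = resolve("next week", published, first_day="sunday")
```

With this publication date, the quotation event lands on July 3, while the travel event spans July 9 to 15 under the Monday convention and July 8 to 14 under the Sunday convention. A one-day shift like that is exactly the kind of ambiguity the system has to settle before it can place an event on a timeline.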
A metric scores the “momentum” of people or events in the news based on how many times they’re mentioned on the web, gauging the likelihood that something’s actually going to happen. Also scored is the “sentiment” of events, or the attitude of the report’s author. The following graph shows the spike in negative sentiment per day surrounding “Muammar al-Gaddafi” at the beginning of the Libyan crisis in 2011. It eventually tapers off as the media’s focus on the country decreases.
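The company doesn’t publish how momentum is actually computed, so the following Python sketch is only one plausible reading of “how many times they’re mentioned.” It compares the number of mentions in the most recent window with the number in the window before it. The function name `momentum`, the seven-day window, and the add-one smoothing are all assumptions, and the mention dates are made-up sample data.

```python
from datetime import date, timedelta


def momentum(mention_dates, today, window_days=7):
    """Ratio of recent mentions to mentions in the preceding window.

    Hypothetical scoring rule for illustration only. Values above 1.0
    mean an entity is being mentioned more often than before.
    """
    recent_start = today - timedelta(days=window_days)
    previous_start = recent_start - timedelta(days=window_days)
    recent = sum(1 for d in mention_dates if recent_start < d <= today)
    previous = sum(1 for d in mention_dates if previous_start < d <= recent_start)
    # Add-one smoothing keeps the ratio defined when the earlier window is empty.
    return (recent + 1) / (previous + 1)


# Made-up sample: 2 mentions in the earlier week, 8 in the latest week.
today = date(2011, 2, 21)
mentions = [date(2011, 2, 9), date(2011, 2, 12)] + [date(2011, 2, 15 + i) for i in range(7)] + [date(2011, 2, 21)]
score = momentum(mentions, today)
```

For this sample the score comes out to 3.0, since (8 + 1) / (2 + 1) = 3. A ratio is used rather than a raw count so that a perennially famous name doesn’t automatically outrank a sudden newcomer, though a production metric would presumably weigh sources and other signals as well.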
According to the company, their three-dimensional strategy of structure, time, and momentum allows them to answer questions like “Which heads of state visited Libya in 2010?”, “What pharma companies are releasing new products in the first quarter of 2012?”, and “What do French bloggers say about the earthquake in Haiti?”.
It’s true that anyone could find the answers to these questions given enough time and effort, but the whole point of Recorded Future is empowering people through convenience and efficiency.
Here’s a short video that summarizes Recorded Future’s strategy.
It all sounds good in principle, but a closer look reveals that Recorded Future’s algorithms could stand some tweaking. An old timeline I looked at tracking references to Facebook showed a sharp dip in sentiment. The “negativity” came from an article about the Keller, Texas school district, where voters shot down a proposed tax hike. There was so much negative sentiment in Keller about the proposal that one group raised $10,000 to stop the hike, and it did so through Facebook.
That is, in fact, the only connection between Facebook and Keller, Texas described in the article.
Search algorithms have some room for improvement, as we are often reminded through Google Search. But Search is still a powerful tool. The following is an example of how Recorded Future might be used.
The next two images are taken from Recorded Future’s tracking of cyber threats. Below is a source map of the many companies and organizations, places, persons, and products the system drew its information from. Certainly more sources than I’m used to perusing on a daily basis.
The following shows the near future of hacking-related activities. One key event the system detected came from the conservative site Human Events, where it was rumored that a powerful virus could wipe out hundreds of thousands of computers on July 9.
At first glance the timeline doesn’t look like there’s much activity at all, but taken in the larger context of the first half of 2012, there does indeed appear to be a spike in activity. And, as cyber attacks are often politically motivated, the near future of reported “Hot Button Sociopolitical Events” can be plotted as well. Then, as Recorded Future writes, “It leaves the analyst to then ask questions about their organization’s apparent ties or support to contentious causes that might make them a target on these dates.”
One confounding variable in telling the future, however, could be Recorded Future itself. It’s very possible that a lot of companies are already using their own Recorded Future-type algorithms to get a bead on the competition. If all companies were to use the same software, at what point does the system become less of a tool to predict, and more of a tool to promote? For instance, if several investment firms sell shares of a particular company because its share price is rumored to drop significantly, the selling itself could help bring about the drop.
The amount of information on the web is staggering, and we need tools to access the information relevant to us. Google became the giant that it is because of its ability to connect people to the information they seek. The challenge now is to find efficient ways to extract information that answers specific questions. Will Recorded Future prove useful enough for companies to want to gaze into its crystal ball?
“Ask again later.”
[image credits: Recorded Future and Gizmag]