Why Big Tech Companies Are Open-Sourcing Their AI Systems

The world’s biggest technology companies are handing over the keys to their success, making their artificial intelligence systems open-source.

Traditionally, computer users could see the end product of what a piece of software did by, for instance, writing a document in Microsoft Word or playing a video game. But the underlying programming, the source code, was proprietary and kept from public view. Opening source material in computer science is a big deal because the more people who look at code, the more likely it is that bugs will be found and that long-term opportunities and risks will be worked out.

Openness is increasingly a big deal in science as well, for similar reasons. The traditional approach to science involves collecting data, analyzing the data and publishing the findings in a paper. As with computer programs, the results were traditionally visible to readers, but the actual sources — the data and often the software that ran the analyses — were not freely available. Making the source available to all has obvious communitarian appeal; the business appeal of open source is less obvious.

Microsoft, Google, Facebook and Amazon have been making remarkable progress developing artificial intelligence systems. Recently they have released much of their work to the public for free use, exploration, adaptation and perhaps improvement.

This seems bizarre: why would companies reveal the methods at the core of their businesses? And what does their embrace of open-source AI say about the current state of artificial intelligence?

Remarkably powerful software

Each technology that’s being revealed displays remarkable capabilities that go well beyond what was possible even just 10 years ago. They center on what is called “deep learning” — an approach that organizes layers of neural networks hierarchically to analyze very large data sets not just in search of simple statistics but also seeking to identify rich and interesting abstract patterns.

Among the technologies that major tech companies have opened recently are:

(Facebook’s director of AI research Yann LeCun demonstrates the power of M.)

To understand what is driving these trends toward open source AI, it is helpful to consider other organizations in the broader social context in which these companies operate.

Even the military is going open source

One useful comparison is DARPA, the research arm of the U.S. Department of Defense. It is hard to imagine an organization likely to be more concerned about others taking advantage of open information. Yet, DARPA has made a big push toward open-source machine learning technologies.

Indeed, the DARPA XDATA program resulted in a catalog of state-of-the-art machine learning, visualization and other technologies that anyone can download, use and modify to build custom AI tools. (I was a research lead on the CrossCat/BayesDB project that was supported through this program.)

The fact that DARPA and the Defense Department are so supportive of open-sourcing strongly indicates that its advantages outweigh the disadvantages of making high-quality tools available to potential adversaries.

Another useful comparison is the OpenAI project, recently announced by tech entrepreneurs Elon Musk and Sam Altman, among others. The effort will study the ethics of creating and releasing machines with increasing abilities to interact with and understand the world.

While these goals will be familiar to anyone who has read Isaac Asimov, they mask a deeper issue: even experts do not understand when or how AI might become powerful enough to cause harm, damage or injury.

Open sourcing of code allows many people to think through the consequences both individually and together. Ideally, that effort will advance software that is increasingly powerful and useful, but also broadly understandable in its mechanisms and their implications.

AI systems involve large (often very, very large) amounts of code, so much that it stretches the ability of any single individual to understand it in both breadth and depth. Scrutiny, troubleshooting and bug-fixing are especially important in AI, where we are not designing tools to do a specific job (e.g., build a car), but to learn, adapt and make decisions in our stead. The stakes are higher for both the positive and the potentially negative outcomes.

Open source AI makes business sense

Neither DARPA's motivations nor OpenAI's explain exactly why these commercial technology companies are open-sourcing their AI code. As technology companies, their concerns are more immediate and concrete. After all, if nobody is using their products, then what good are nice clean code and well-intentioned algorithms?

There is a common view within the industry that technology companies like Google, Facebook and Amazon are not in the businesses one might assume. Over the long run, Google and Facebook are not really in the business of selling ads, and Amazon is not in the business of selling merchandise. No, these technology companies are powered by your eyeballs (and data). Their currency is users. Google, for example, gives away email and search for free to draw users to its products; it needs to innovate quickly, producing more and better products to ensure you stay with the company.

These companies open-source their AI software because they wish to be the foundations on which other people innovate. Any entrepreneur who does so successfully can be bought up and easily integrated into the larger parent. AI is central because it, by design, learns and adapts, and even makes decisions. AI is more than a product: it is a product generator. In the near future, AI will not be relegated to serving up images or consumer products, but will be used to identify and capitalize on new opportunities by innovating new products.

Open-sourcing AI serves these companies' broader goals of staying at the cutting edge of technology. In this sense, they are not giving away the keys to their success: they are paving the way to their own future.


Patrick Shafto, Associate Professor of Mathematics and Computer Science, Rutgers University Newark

This article was originally published on The Conversation. Read the original article.

Patrick Shafto

I am Henry Rutgers Term Chair in Data Science and Associate Professor in the Department of Mathematics and Computer Science at Rutgers - Newark. I am also affiliated with the Institute for Data Science, Learning and Applications (I-DSLA) and have appointments in Psychology, Rutgers Business School, and the Center for Molecular and Behavioral Neuroscience (CMBN) at Rutgers.

I lead the CoDaS lab. The goal of my research is to bridge Cognitive Science and Data Science by understanding human perception and cognition and developing more cognitively natural machine learning and data science tools.

Discussion — 3 Responses

  • John Nasbitt February 25, 2016 on 2:25 am

    There is also very little value in keeping AI data-source “closed” simply because very few organizations apart from the big-tech giants can use and leverage this information. Open sourcing AI also allows the parent companies to keep an eye on how it's being used, a far more important commodity than the AI itself.

  • Tober March 2, 2016 on 10:42 am

    There is only 1 correct answer / nothing in between…and IT giants ( like dinosaurs ) can not ” by default ” ( make ) understand intelligence. It is one man job no matter how crazy it sounded. Security measures are most important and the most difficult. https://evolutionofhumanintelligence.wordpress.com/

  • Prasad N R July 7, 2016 on 12:20 pm

    This one is excellent. You are requested to continue publishing such awesome articles. It is because of such articles that I open-sourced the entire software of my (unregistered) company too. While it is easy to perceive open-sourcing as driven by ulterior motives, it presumably is not, and this article nicely highlights those important aspects.