How Will Artificial Intelligence Impact Open Technologies?

U.S. Patent No. 9,340,178 B1

Mozilla has long been one of the strongest champions for openness in technology — from the software it produces to the web standards it adopts. As new technologies emerge, the battle over closed versus open systems continues to be one of the most important factors for a range of concerns that are critical to a healthy information ecosystem — innovation, competition, privacy, security, consumer protection — and even civil rights.

With new advances in artificial intelligence, particularly in the fields of machine learning and sensor technology, questions of “open” versus “closed” have arisen again. However, what is quickly becoming clear is that traditional open strategies, such as permissive licensing and code/documentation publication, may not work as well, or even at all.

Consider deep learning, one of the key AI techniques driving advances such as automated speech recognition. In addition to code, algorithms, and data sets, this machine learning method requires immense computational power to discern complex patterns and data representations. But what does it mean to have “open” computational power in this context? Is it something open innovation communities can generate at sufficient capacity to compete with AI giants such as Facebook, IBM, Google, Microsoft, Apple, Amazon, and Baidu? Projects such as SETI@Home and other crowdsourcing efforts have historically tried to address these concerns, but it is unclear whether similar efforts could keep up with the computational acceleration these firms are relentlessly pursuing.

Or consider the impact of open vs. closed Internet of Things (“IoT”) on AI. As autonomous network agents proliferate within our phones, refrigerators, health care devices, cars, clothes, mattresses, and even children’s toys, these objects are quickly becoming one of the most powerful vectors for data used in AI systems. Yet as consumers, we have little to no access to how they work, what they collect, whom they share it with, or even when they are recording our every sound, movement, or decision. And as we have learned with Android OS phones, simply having access to the source code often doesn’t provide the necessary level of transparency, security, or accountability we demand from open source software, especially when so much of it happens on the server side of the equation. Even building on top of a closed platform has become more legally complicated in light of the recent battles over the copyrightability of APIs.

Moreover, whatever knowledge or intelligence is derived by AI-driven IoT from our data will be fiercely guarded by most ML/AI providers. As the New York Police Department recently discovered in its ongoing dispute with Palantir over who owns New York City’s crime data analytics, companies love to use our data for training algorithmic systems, but strongly resist sharing back the results. This means that the more AI systems learn about us, the harder it will be to understand what they know or to take that knowledge to a competing AI service if we want to switch to new, more open devices. This can obfuscate and even accentuate some of the troubling bias problems that have been reported.

Geographic Distribution of AI Patents

Pile on top of this the problem of patents. Open source software has largely been able to avoid or route around most patent problems, but with IoT and AI in the mix, we should expect a whole new round of patent ‘land grabs’ that are both expensive and difficult to defend against. And unlike most software and web technologies, where specific proprietary layers can be replaced when open versions are available, AI technologies are intensely integrated and complex, and generally lack the discernible or distinct components necessary to implement such an approach as a long-term solution. They also represent part of a larger trend within the broader Internet and tech industry where growing centralization and locked-down vertical integration are becoming increasingly dominant business models that make switching to new products or services more and more difficult.

So what is to be done about it all? While efforts to provide “open” code, algorithms, and training data are laudable, the computation, competition, and accountability/audit concerns are unlikely to be answered with standard open source approaches. Instead, we will need new methods both to measure the negative impacts of AI closure and to ‘fork’ alternatives in meaningful technological, economic, and social ways. Some preliminary work on the IoT aspects of these problems is underway at Mozilla’s Open IoT Studio. And new initiatives such as FATML and NYU’s AI Now Institute (where I am a research lead) should provide much-needed measurement tools and empirical research to understand these concerns in greater depth and detail.

Fundamentally, however, it will be incumbent on Mozilla and other open technology advocates to press on these issues, especially in America, China, Canada, Australia, and Europe, where the greatest investments in AI are happening. As part of my Mozilla Tech Policy Fellowship, I’ll be looking into these questions and possible interventions to help fight to keep AI and IoT as parts of the open technology ecosystem.