Continual Distributed Homomorphic Learning: the Future of AI?
Towards the concept of Fluid Intelligence
In the last decade we have witnessed tremendous progress in Artificial Intelligence (AI) due to recent Machine Learning (ML) advances. Deep Learning (DL) models and techniques have enabled a major leap in automation complexity, opening up a large set of new applications involving high-dimensional perceptual data that were unthinkable to tackle a couple of decades ago.
Machine Learning, I think for the best, seems to be the front-runner on the long path towards strong AI.
So, what’s next? What to expect for the future of Machine Learning and AI?
While Medium is not the place to dwell on comprehensive and articulated considerations about the future of AI, in this brief post I speculate about the possible intersection of three fundamental research areas (Continual Learning, Distributed Learning and Homomorphic Encryption) and about what their combination would entail for the next generation of AI systems.
Continual Learning (CL) is built on the idea of learning continuously and adaptively about the external world and enabling the autonomous incremental development of ever more complex skills and knowledge.
From a more practical point of view, it means being able to smoothly update our prediction models to take into account different tasks or data distributions, while still being able to reuse and retain useful knowledge and skills acquired previously.
Hence, CL is the only paradigm that forces us to deal with a longer and more realistic time scale, in which data (and tasks) become available only over time, we have no access to previous data, and it is imperative to build on top of previously learned knowledge.
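To make the setting a bit more concrete, here is a minimal sketch of a learner that is updated one sample at a time and never stores past data. It assumes a simple nearest-class-mean classifier, which is only one of many possible continual learning strategies and not a method this post is built on; the class name `IncrementalNearestMean` and the toy data are hypothetical.

```python
from collections import defaultdict


class IncrementalNearestMean:
    """Nearest-class-mean classifier updated one sample at a time.

    Only per-class running sums and counts are kept, so past samples
    never need to be stored or revisited.
    """

    def __init__(self):
        self.sums = {}                  # label -> per-feature running sum
        self.counts = defaultdict(int)  # label -> number of samples seen

    def update(self, x, label):
        """Fold a single new sample into the running statistics."""
        if label not in self.sums:
            self.sums[label] = [0.0] * len(x)
        self.sums[label] = [s + v for s, v in zip(self.sums[label], x)]
        self.counts[label] += 1

    def predict(self, x):
        """Return the label whose class mean is closest to x."""
        def squared_distance(label):
            n = self.counts[label]
            return sum((v - s / n) ** 2 for v, s in zip(x, self.sums[label]))
        return min(self.sums, key=squared_distance)


model = IncrementalNearestMean()

# First task: two classes arrive.
for x, y in [([0.0, 0.1], "a"), ([0.2, 0.0], "a"),
             ([1.0, 1.1], "b"), ([0.9, 1.0], "b")]:
    model.update(x, y)

# Second task: a new class arrives later; the first task's data is gone.
for x, y in [([3.0, 0.0], "c"), ([3.1, 0.2], "c")]:
    model.update(x, y)
```

After the second task the model can recognize the new class "c" while still recognizing "a" and "b", even though it never saw the old samples again. Real deep models do not get this for free, which is exactly why continual learning is hard.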
As I discussed in depth in my previous blog post “Why Continual Learning is the key towards Machine Intelligence”, being able to learn continuously from a never-ending stream of data (like our biological counterparts) may be the key to endowing our artificial learning systems with three extremely important properties of every intelligent agent: adaptation, scalability and autonomy.
Learning over multiple computational nodes has always been a common practice in machine learning for speeding up the training of our algorithms, distributing computations over multiple CPUs/GPUs on a single machine or across several machines.
The current Data Parallelism approach generally assumes either efficient data forwarding across nodes or the availability of the same data in each computational node, dynamically splitting the training workload over multiple batches.
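As a rough illustration of why splitting a batch across workers works, the sketch below computes a gradient on two equal-sized shards and averages the results. The one-parameter model `y = w * x`, the function name `grad_mse` and the numbers are all assumptions made up for this example.

```python
def grad_mse(w, batch):
    """Gradient of the mean squared error for the model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]
w = 0.5

# Each "worker" receives an equal-sized shard of the same batch.
shards = [data[:2], data[2:]]
worker_grads = [grad_mse(w, shard) for shard in shards]

# Averaging the per-worker gradients recovers the full-batch gradient.
averaged_grad = sum(worker_grads) / len(worker_grads)
full_grad = grad_mse(w, data)
```

With equal-sized shards the averaged gradient matches the full-batch gradient up to floating-point error, so the workers can proceed in parallel and only exchange gradients.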
On the other hand, the less popular Model Parallelism approach splits a massive computational graph, rather than the data, across multiple computational nodes when the learning procedure cannot be accomplished within a single one (because of memory or computational constraints).
However, a recent trend in Deep Learning named Federated Learning focuses on reducing the communication bandwidth needed for learning, especially when the data are never present together on the same computational node and are never forwarded from one node to another.
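A minimal sketch of this idea, assuming a federated-averaging style scheme: each client trains locally on data that never leaves it, and only the resulting model parameter is sent back and averaged. The one-parameter model, the helper names `local_train` and `federated_round`, and the toy data are hypothetical.

```python
def local_train(w, data, lr=0.01, epochs=5):
    """Run a few epochs of SGD on one client's private data (model: y = w * x)."""
    for _ in range(epochs):
        for x, y in data:
            w -= lr * 2 * (w * x - y) * x
    return w


def federated_round(global_w, clients):
    """One round: clients train locally, the server averages their weights.

    The average is weighted by the number of samples on each client.
    Only model parameters are exchanged, never the data.
    """
    total = sum(len(data) for data in clients)
    return sum(local_train(global_w, data) * len(data) for data in clients) / total


# Two clients holding disjoint private samples of the relation y = 3 * x.
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0), (4.0, 12.0), (5.0, 15.0)],
]

w = 0.0
for _ in range(20):
    w = federated_round(w, clients)
```

In this toy setup the shared parameter converges towards 3 even though no client ever sees the other's samples. Real deployments add client sampling, compression and secure aggregation on top of this basic loop.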
Finally, knowledge distillation is another interesting area of study, concerned with the idea of distilling knowledge from one model and instilling it into another. Knowledge distillation is particularly interesting for distributed learning since it opens the door to a completely asynchronous and autonomous way of learning, in which the knowledge acquired in different computational nodes is fused only later.
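One common way to formulate distillation, sketched here under my own assumptions, is to make a student model match the temperature-softened output distribution of a teacher. The function names, the temperature value and the logits below are hypothetical and purely illustrative.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with temperature; higher temperatures give softer distributions."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the teacher's soft targets to the student's outputs."""
    p = softmax(teacher_logits, temperature)  # teacher (target) distribution
    q = softmax(student_logits, temperature)  # student distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


teacher = [2.0, 1.0, 0.1]
close_student = [1.8, 1.1, 0.2]
far_student = [0.1, 1.0, 2.0]

loss_close = distillation_loss(close_student, teacher)
loss_far = distillation_loss(far_student, teacher)
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, so minimizing it transfers the teacher's "knowledge" without touching the teacher's training data.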
Homomorphic encryption is a form of encryption that allows computation on ciphertexts, generating an encrypted result which, when decrypted, matches the result of the operations as if they had been performed on the plaintext.
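To see what "computation on ciphertexts" can look like, here is a toy sketch of the Paillier cryptosystem, which is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. I picked Paillier only for illustration. It supports addition but not arbitrary computation, the tiny primes are hypothetical and completely insecure, and the function names are my own.

```python
import math
import random


def keygen(p=293, q=433):
    """Build a toy Paillier key pair from two small primes (insecure!)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse, valid for the generator g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt an integer 0 <= m < n using the public key n and g = n + 1."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)  # random blinding factor coprime with n
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, private_key, c):
    """Recover the plaintext from ciphertext c."""
    lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Homomorphic addition: multiply the ciphertexts modulo n squared."""
    return (c1 * c2) % (n * n)


n, private_key = keygen()
c_sum = add_encrypted(n, encrypt(n, 17), encrypt(n, 25))
total = decrypt(n, private_key, c_sum)
```

Here `total` is 42: the party that added the two ciphertexts never saw 17 or 25, and only the holder of the private key can read the result.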
Differential privacy, instead, is a statistical technique that historically aims to maximize the accuracy of queries on statistical databases while minimizing the privacy leaked about the individuals whose information is in the database. Differential privacy has also been applied to Deep Learning with varying degrees of success.
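A classic way to answer such a query is the Laplace mechanism, sketched below for a simple counting query. The records, the privacy parameter `epsilon` and the function names are assumptions for illustration only; the noise is sampled as the difference of two exponential variables, which follows a Laplace distribution.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(records, predicate, epsilon, rng):
    """Answer a counting query with epsilon-differential privacy.

    Adding or removing one person changes a count by at most 1
    (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


ages = [23, 35, 41, 29, 52, 38]  # hypothetical database of ages
rng = random.Random(0)           # seeded only to make the sketch repeatable

noisy_answer = private_count(ages, lambda age: age >= 35, epsilon=1.0, rng=rng)
```

A smaller `epsilon` means more noise and stronger privacy; a larger one means more accurate answers and weaker privacy. The noise is zero on average, so aggregate statistics remain useful while any single individual's presence is masked.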
Homomorphic encryption and differential privacy, if used together, can allow secure and privacy-preserving learning over personal data, making the data impossible to access openly and limiting the amount of private information leaked through learning.
The Concept of Fluid Intelligence
All these areas, despite some affinities and commonly used techniques, have largely been studied in isolation. It is not very hard to envision that every one of them will have a place in the next wave of AI-enabled systems. So, what if we put everything together?
I think that their association will enable the emergence of what I call here the concept of Fluid Intelligence:
Fluid Intelligence is the result of a decentralized, dynamic learning system composed of multiple AI agents, operating asynchronously and autonomously in the world, always improving and fluidly exchanging knowledge among each other in a secure, privacy-preserving way.
Continual learning and distributed learning techniques in this vision allow each AI agent to continually learn from the always-changing external world or to integrate knowledge and skills distilled from other AI agents (Matrix style! 😎). On the other hand, homomorphic encryption and differential privacy may make this process secure and privacy-preserving.
Similarly to the World Wide Web revolution, the interplay of these technologies and the fluid interactions and knowledge exchanges will dramatically change how we think about the digital world. I argue that information and data exchange will become less common than knowledge and skills distillation. After all, why would you need to recover data if someone else already has the right knowledge to solve the specific task at hand?
While the Web has put data and information a click away, as humans we still waste a lot of time learning, over and over again, the same concepts and skills that someone else in the world has most likely already acquired.
Will the future of AI be a lot more about knowledge distillation and instillation than about learning from scratch? I speculate that this will be the case, essentially skipping the human bottleneck of transferring knowledge and skills through a raw data communication channel whose content needs to be re-elaborated.
Security and privacy, despite being largely ignored nowadays in machine learning applications (with big IT companies accumulating personal data on their servers), will become central issues when talking about Intelligence. As the world takes its first step towards an ecosystem of AI agents fluidly and efficiently exchanging knowledge among each other, let’s all make sure it does so in a decentralized, secure and private way.
If you’d like to see more posts on AI and Continual Learning, follow me on Medium or join ContinualAI.org: an open community of more than 350 researchers working together on this fascinating topic! (Join us on Slack today! 😄 🍻)
If you want to get in touch, visit my website vincenzolomonaco.com or leave a comment below! 😃