Community Focused Applied Machine Learning
How My Past and Present Collided at Applied Machine Learning Days in Lausanne, Switzerland
In late January I was very fortunate to attend and contribute to a conference in Lausanne, Switzerland called Applied Machine Learning Days. I am privileged that part of my job as a Developer Advocate at IBM’s Center for Open-Source Data and AI Technologies is to seek out conferences and communities where I can share what I’m working on at IBM. I first discovered this conference while searching online for conferences related to machine learning. I was hoping to present on the Adversarial Robustness Toolbox (ART), an open source Python library for rapid crafting and analysis of attack and defense methods for machine learning models. The conference caught my attention because it prominently featured a keynote by Garry Kasparov, the former chess world champion from Russia who is widely considered to be the best chess player of all time. I grew up in Russia, so this made it seem meant to be.
An International Collaboration
I submitted an abstract prepared with my colleague Animesh Singh, and it got accepted! I was very happy, and then I noticed that a call for posters was still open. One of my favorite topics is open standards for predictive model deployment: PMML, PFA, and ONNX. I wrote about PMML in an earlier blog. Since I knew that Ludovic Claude, a very active contributor to open source code related to PFA, lived in Lausanne, I suggested to him that we create a joint poster on PMML and PFA, and he agreed. This conference is devoted to applications of machine learning, and Ludovic is actively using PFA in his work with medical applications at CHUV, a network of research hospitals, which was just what the conference called for! We also decided to invite my colleague Nick Pentreath, who is well known to PFA fans as the creator of Aardpfark, a PFA export package for SparkML. Nick lives in Cape Town, South Africa, Ludovic in Lausanne, Switzerland, and I am in Chicago, IL, USA, so it was truly an international collaboration!
My Favorite Things about the Applied ML Days Conference
The conference was held at the SwissTech Convention Center, right on the campus of EPFL, a renowned technological institute and one of the best in Europe. It is very close to the metro station platform. Lausanne has great public transportation!
I attended a few workshops at the conference and want to share some of what I learned.
The first workshop I attended was “Hands-on deep learning with TensorFlow.js”. I had heard about TensorFlow.js from my colleagues and was curious to learn more about it. The workshop was taught by Frederic Ouwehand and Harry Anderson. They prepared a GitHub repository with all the necessary code and explanations. The participants performed data wrangling and trained a model using TensorFlow, then exported the model to TensorFlow.js and deployed it in the browser. I learned that in the browser TensorFlow.js can use the GPU through WebGL, while the server-side (Node.js) version binds to the TensorFlow C binaries, which provide better performance. It is also possible to run the code on a Raspberry Pi, which is often used in robotics or IoT applications. Using Colaboratory, a free Jupyter notebook environment, made the whole process easy.
On Sunday morning I went to the workshop on open data organized by Oleg Lavrovskiy, as the topic seemed fitting for my group at IBM, the Center for Open-Source Data and AI Technologies (CODAIT). Very few people attended that one, so the atmosphere was pretty casual. Oleg explained the Frictionless Data system that he designed for standardizing open data documentation. I told him and the other participants about the open source project Egeria, created by IBM Distinguished Engineer Mandy Chessell and her team. Egeria is interesting because it provides an Apache 2.0 licensed open metadata and governance type system, frameworks, APIs, event payloads and interchange protocols. These enable tools, engines and platforms to exchange metadata in order to get the best value from data whilst ensuring it is properly governed.
On Sunday afternoon I went to a very interesting workshop, “Artificial Curiosity: Intrinsic Motivation in Machines Too!”. I learned a lot about reinforcement learning, which is a really neat branch of machine learning. Basically, reinforcement learning is concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. It is considered one of three machine learning paradigms, alongside supervised learning and unsupervised learning. At the workshop we learned how deep learning is used in reinforcement learning and how intrinsic motivation helps to make progress in areas where traditional reinforcement learning was not doing so well.
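To make “cumulative reward” and “intrinsic motivation” a little more concrete, here is a minimal Python sketch. It is not taken from the workshop materials: the discounted return is the standard textbook definition, while the count-based novelty bonus is only one simple way to model curiosity, chosen here as an assumption. The function names, `gamma`, and `beta` are all hypothetical.

```python
import math
from collections import Counter


def discounted_return(rewards, gamma=0.99):
    """Cumulative reward: r_0 + gamma * r_1 + gamma^2 * r_2 + ..."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


def with_curiosity_bonus(states, extrinsic_rewards, beta=1.0):
    """Add an intrinsic bonus that shrinks as a state is revisited.

    Assumption: curiosity is modeled as beta / sqrt(visit count),
    a simple count-based novelty bonus.
    """
    visit_counts = Counter()
    shaped = []
    for state, reward in zip(states, extrinsic_rewards):
        visit_counts[state] += 1
        shaped.append(reward + beta / math.sqrt(visit_counts[state]))
    return shaped


# The environment gives no reward at all, yet the agent is still
# "paid" for reaching states it has not seen before.
states = ["a", "a", "b"]
shaped_rewards = with_curiosity_bonus(states, [0.0, 0.0, 0.0])
print(shaped_rewards)
print(discounted_return(shaped_rewards, gamma=0.5))
```

With a bonus like this, an agent in an environment with sparse external rewards still has a signal to follow, which is the kind of situation where plain reinforcement learning tends to stall.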
The Poster We Presented at the Conference
You can view the poster we presented at the conference; it is titled “Open standards for deployment, storage, and sharing of predictive models. PMML/PFA/ONNX in action”. Many people going through traditional machine learning (ML) classes may think that building a good model for their data is the end of the project. Model deployment, the step in machine learning where you make the model available for computing predictions for new data, is actually one of the most important parts of the ML process. This is the part where the benefits of ML can be realized. It is often not easy, because different teams and environments are used for model building and deployment. The need to keep all the data preparation steps along with the model adds another challenge. Open standards for predictive model exchange and deployment are designed to help resolve this problem.
Predictive Model Markup Language (PMML), created by the Data Mining Group (DMG), was the first such standard. It is based on XML and was originally released around 1997. The DMG is a group of companies working together on open standards for predictive model deployment. Both IBM and SPSS were among the founding members of the group and its most active members (IBM acquired SPSS in 2009, and I had been working at SPSS since 2000). The latest release, PMML 4.3, contains 16 model types, ways to combine the models into ensembles or compositions, descriptions of the input data, data transformations, as well as data and model statistics. It is used by over 30 companies and open source packages. I am fortunate to have been working with the DMG since 2001. PMML makes it very easy to build a model in one commercial system or open source package, then save it in PMML and deploy it into a totally different system, as long as both are PMML-compliant.
PMML is great except when one of your models or important features is not yet supported by the standard. Bringing new features into PMML can take a very long time.
Several years ago the DMG created a new standard called Portable Format for Analytics (PFA). It is JSON-based and uses a different approach to model representation, to avoid the problems found with PMML. Instead of describing the models and transformations, PFA provides the building blocks to describe the scoring procedure in its own mini-language. This standard is growing in popularity and attracts a lot of interest. Ludovic is using PFA to collect models from the different hospitals where they are built on local data (the data cannot be moved due to privacy concerns and can only be used locally), to evaluate and combine those models, and to use the models on new patients. This is the diagram he created for the poster.
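To give a feel for how PFA describes a scoring procedure rather than a model, here is a small Python sketch. The document itself follows the basic PFA layout (an `input` type, an `output` type, and an `action` list of expressions). The `evaluate` and `score` helpers are hypothetical toy functions written for this illustration; they handle only `+` and `*` and are not a real PFA engine.

```python
import json

# A tiny PFA-style document: take a double, return it plus 100.
pfa_document = {
    "input": "double",
    "output": "double",
    "action": [{"+": ["input", 100]}],
}


def evaluate(expr, env):
    """Toy evaluator for a tiny subset of PFA expressions (illustrative only)."""
    if isinstance(expr, (int, float)):
        return expr                      # numeric literal
    if isinstance(expr, str):
        return env[expr]                 # symbol reference such as "input"
    (function_name, args), = expr.items()  # {"fn": [arg1, arg2]}
    values = [evaluate(arg, env) for arg in args]
    if function_name == "+":
        return values[0] + values[1]
    if function_name == "*":
        return values[0] * values[1]
    raise NotImplementedError(function_name)


def score(document, datum):
    """Run each action expression in order; the last value is the output."""
    result = None
    for expr in document["action"]:
        result = evaluate(expr, {"input": datum})
    return result


# The document is plain JSON, so it can be saved in one system and
# loaded in another before scoring.
serialized = json.dumps(pfa_document)
print(serialized)
print(score(json.loads(serialized), 3.0))
```

Because the scoring logic is spelled out from generic building blocks, a new kind of model does not have to wait for the standard to define it first, which is the limitation of PMML described above.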
Unfortunately, neither PMML nor PFA supports deep learning models yet, even though they are now used everywhere. We are working on adding support, but it is not easy, given how large the models can be and the wide variety of model building blocks. A paper containing a proposal for a new PMML model supporting deep learning has been accepted to the International Design Engineering Technical Conferences & Computers and Information in Engineering Conference (IDETC/CIE 2019), to be held in August in Anaheim, California. It was written by Stanford graduate student Max Ferguson, working with a group of DMG participants (including myself) and Stanford professor Kincho Law.
Microsoft and Facebook introduced the Open Neural Network eXchange format, ONNX (pronounced “onyx”), in September 2017. Most popular deep learning frameworks now support it. An IBM team is actively working on converters for TensorFlow and is now leading the working group on ONNX Training. Our poster included all three of those open standards.
Many people expressed interest in our poster’s contents, including Bastiaan Quast and Fred Werner who work on IT for the United Nations. They were looking for open standards for the predictive models deployed for various needs at the UN.
Adversarial Robustness Toolbox
After the poster sessions were done, it was time for me to present my talk on the Adversarial Robustness Toolbox. This is an open source library dedicated to adversarial machine learning.
In recent years we have seen amazing progress in the ability of deep learning models to recognize images, sounds, speech, etc. However, the mechanisms by which the models learn may be very different from how humans recognize images, sounds, etc. Researchers have found examples where very small but deliberate perturbations of images, not even visible to a human, can confuse a deep learning network that seemed to perform well on the regular training and testing data.
For example, the left part of the figure above shows a picture of a giant panda that is correctly recognized by a typical convolutional neural network with high confidence (91%), but after a small amount of specially created “noise” is added, the prediction changes to “capuchin” (a small monkey) with very high confidence. Searching for more such examples and for ways to make models immune to such attacks resulted in a new field of study: “adversarial learning”. Ways to “fool” the models are called “adversarial attacks”. As with conventional weaponry, when the attack tools are improved, people also work on improving the defense tools (or creating new ones). This process is ongoing. At one point it seemed that a universal defense tool (defensive distillation) had been found, since it could protect against all attacks known at the time. Then Nicholas Carlini and David Wagner found three new kinds of attacks, so the process continued.
The attacks can be targeted or untargeted, “white box” or “black box”. Targeted attacks modify the input in such a way as to make the model give a specific incorrect prediction, while untargeted attacks aim for any incorrect prediction. The paper by Carlini and Wagner illustrates targeted attacks by showing how several images can be slightly modified to generate any other prediction; see the figure below.
A “white box” attack uses full access to the classification model to create the perturbed image, while a “black box” attack does not have access to the model. It was discovered that if one deep learning model misclassifies a perturbed image, then another deep learning model, possibly with a very different architecture but built on the same data, would most likely give the same incorrect prediction on that input. This property is called “transferability” and can be used in “black box” attacks, as long as the perpetrator has access to the original data to create another model. Additionally, there can be “poisoning” attacks, where the training data (collected using crowdsourcing) is intentionally mislabeled to cause misclassification of certain inputs (such as classifying Stop signs with stickers as a totally different road sign).
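As a rough illustration of a white-box, untargeted attack, here is a minimal Python sketch. It uses the fast gradient sign method, which is my choice for the example rather than anything specific to the talk, and it attacks a hand-written logistic regression instead of a deep network. The weights, the input, and `epsilon` are made-up toy values, and the perturbation is far larger than the invisible ones used against real image models.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Probability of class 1 under a logistic regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def fgsm_perturb(weights, bias, x, true_label, epsilon):
    """Shift every feature by +/- epsilon in the direction that increases the loss.

    White box: the attacker reads the model's weights directly.
    For cross-entropy loss the gradient w.r.t. feature i is (p - y) * w_i.
    """
    p = predict_proba(weights, bias, x)
    gradient = [(p - true_label) * w for w in weights]
    signs = [(g > 0) - (g < 0) for g in gradient]
    return [xi + epsilon * s for xi, s in zip(x, signs)]


# Toy model and input (hypothetical values chosen for illustration).
weights, bias = [2.0, -3.0, 1.0], 0.0
x = [0.5, 0.1, 0.2]

x_adv = fgsm_perturb(weights, bias, x, true_label=1, epsilon=0.2)

print("original is class 1:   ", predict_proba(weights, bias, x) > 0.5)
print("adversarial is class 1:", predict_proba(weights, bias, x_adv) > 0.5)
```

No feature moves by more than `epsilon`, yet the predicted class flips. A targeted attack would instead move the input toward one chosen wrong class.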
So how do you defend against such attacks? As mentioned, a number of defense methods have been found. They use varying approaches, including adding specially perturbed inputs to the original ones in the model building process and smoothing or reducing the precision of the inputs during inference.
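The precision-reduction idea can be sketched in a few lines of Python. This is only a toy version of what the literature often calls bit-depth reduction; the `squeeze` function, the choice of 2 bits, and the sample values are assumptions made for the example, and inputs are assumed to be scaled to the range 0 to 1.

```python
def squeeze(x, bits=2):
    """Round each feature in [0, 1] to one of 2**bits evenly spaced levels."""
    max_level = 2 ** bits - 1
    return [round(value * max_level) / max_level for value in x]


# A clean input and a slightly perturbed copy (hypothetical values).
clean = [0.2, 0.8]
perturbed = [0.23, 0.77]

print(squeeze(clean))
print(squeeze(perturbed))
print(squeeze(clean) == squeeze(perturbed))
```

When the perturbation is smaller than the spacing between levels, the clean and perturbed inputs often collapse to the same values, so the model sees the same thing in both cases. The cost is that some genuine detail is lost as well, so the number of levels is a trade-off.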
The Adversarial Robustness Toolbox is an open source Python library developed by an IBM Research team in Ireland led by Irina Nicolae and Mathieu Sinn. This library supports several popular deep learning frameworks and provides building blocks for many attack and defense methods, as well as ways to detect poisoning attacks and evaluate a model’s robustness to adversarial attacks.
The researchers also created an easy-to-use web-based demo, where one can play with three possible adversarial attack methods (including the Carlini-Wagner one, denoted by C&W) and three possible defense methods, and see the corresponding image changes, classification results, and Python code used for the attack and defense.
The work is still in progress, as more and more new algorithms are published and need to be added. Several new algorithms have already been added since my presentation.
What’s Next For This Work?
I found the conference and my trip to Switzerland to be amazing. Next year the conference will be at EPFL again, and for the future the organizers are looking for other locations. If you know a university that wants to host it, please tell them to get in touch with the AMLD organizers.
Animesh and I plan to present on ART again at the Spark+AI Summit in San Francisco on April 25 and at the Abstractions II conference in Pittsburgh, PA on August 21–23. We will also include it in our 3-hour tutorial at OSCON on July 16 in Portland, Oregon. Come and meet us in person at one or more of these events! In addition, I will be co-presenting the Computer Technology Workshop “Introduction to deep learning and Watson Studio” at the Joint Statistical Meetings in Denver, CO, on July 31, and this workshop will include ART and ONNX along with several other topics.