My Takeaways from Machine Learning Summer School 2018 Madrid
Machine Learning Summer School (MLSS) 2018 was held at the Universidad Autónoma de Madrid, Spain. I was very fortunate to be among the 150 attendees from 30 countries. The audience included Ph.D. students, researchers, and practitioners from various related domains, including Computer Vision, Natural Language Processing, Artificial Intelligence, etc.
The original purpose of the MLSS series was to compensate for the lack of postgraduate-level Machine Learning courses in universities. Even today, when learning materials (e.g., books, publications, MOOCs, tech blogs, etc.) are plentiful, MLSS remains one of the most credible channels for ML education.
Machine learning summer schools present topics which are at the core of modern Machine Learning, from fundamentals to state-of-the-art practice. — mlss.cc
What are the differences?
- I have taken quite a number of online courses over the past few years, some of them paid. But I have never been able to focus the way I did at MLSS. In class, the collective concentration and attentiveness of the classmates also created a perfect atmosphere for effective learning.
- The course was carefully designed to provide a structured overview of various topics in Machine Learning, including the highly anticipated deep learning, neural language models, GANs, etc.; it also covered less mainstream but promising areas such as Gaussian Processes, Causal Inference, Bayesian Nonparametrics, etc. Meanwhile, all speakers are among the most renowned researchers in their respective domains.
- The opportunity to talk face-to-face with the experts, and to have discussions with fellow attendees, is incomparable. The speakers are incredibly approachable, answering questions to the best of their knowledge; they were even willing to share their thoughts and open to collaborating on potential research ideas. Chances to interact with fellow classmates were plentiful too. As the attendees come from various backgrounds and different stages of their careers, I was impressed to hear about their interesting applications, projects, and experiences.
It’s impossible to share many details of 60 hours of lectures in one blog post, so I will summarize three new concepts that I learned, plus the interesting industry-relevant discussions. Interested readers are welcome to check out the slides. Hopefully the recordings will soon be available too.
The first speaker, DeepMind researcher Shakir Mohamed (also an MLSS alumnus), paved the way for the subsequent lectures by “planting the seeds for probabilistic thinking” and introducing a simple framework for approaching any ML problem: the Model-Inference-Algorithm framework. Specifically, models describe the relations between variables (e.g., directed vs. undirected, parametric vs. non-parametric); statistical inference (the learning principle) is the choice of techniques for direct (e.g., Maximum Likelihood, Maximum a Posteriori) or indirect (e.g., Approximate Bayesian Computation, Maximum Mean Discrepancy) estimation of model parameters; lastly, a given model and learning principle can be implemented in different ways, hence the algorithm. Full slides of Shakir’s talk can be found here.
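To make the inference part of the framework concrete, here is a minimal sketch (my own illustration, not taken from the lecture) contrasting Maximum Likelihood with Maximum a Posteriori estimation for the mean of a Gaussian with known variance; the MAP estimate shrinks the ML estimate toward the prior mean:

```python
import numpy as np

# Toy model: x_i ~ N(mu, 1) with unknown mean mu and known sigma = 1
rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=100)

# Maximum Likelihood: maximise the likelihood directly -> the sample mean
mle = data.mean()

# Maximum a Posteriori: add a N(0, tau^2) prior on mu.
# The posterior mode has a closed form: a precision-weighted average of
# the sample mean and the prior mean (0 here).
sigma2, tau2, n = 1.0, 1.0, len(data)
map_est = (n / sigma2 * data.mean()) / (n / sigma2 + 1.0 / tau2)

print(mle, map_est)  # the MAP estimate is shrunk toward the prior mean 0
```

With these choices the MAP estimate is exactly `mle * n / (n + 1)`, so the two agree as the data grows, which is the usual direct-estimation story.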
Causal inference for decision support
Suchi Saria talked about ML and Causal Inference for Reliable Decision Support. The topic is unexpectedly interesting and important for any practical ML application. For example (from this paper), in a supervised prediction problem in a medical context, the inputs are clinical readings of a patient and the goal is to predict whether he/she will have an Adverse Event (AE) onset after 48 hours. It seems straightforward, and tempting, to build a deep learning model mapping inputs to labels. However, the main issue with such a model is that the labels used for training are influenced by any treatment given to the patient during that 48-hour period, which is unobserved in the data.
It is important to be aware of any unobserved variables present in the model and try to remove their dependencies on the observed ones. She went on to introduce some practical ways to do this. (slides)
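To see why this matters, here is a hedged toy simulation (my own, loosely inspired by the example above, with made-up numbers): severity drives adverse events, but clinicians treat the sickest patients, and the unobserved treatment suppresses the events. A naive model trained on the observed labels therefore sees a badly distorted severity-to-outcome relationship:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
severity = rng.normal(size=n)       # observed clinical reading
treated = severity > 0.5            # clinicians treat sicker patients; NOT recorded in the data
# Adverse events become more likely with severity, but treatment strongly prevents them
p_ae = 1.0 / (1.0 + np.exp(-(2.0 * severity - 6.0 * treated)))
ae = rng.binomial(1, p_ae)

# Naive supervised view: association between severity and the observed label
corr_naive = np.corrcoef(severity, ae)[0, 1]

# Counterfactual world where nobody is treated: the relation we actually want to learn
p_ae_untreated = 1.0 / (1.0 + np.exp(-2.0 * severity))
ae_untreated = rng.binomial(1, p_ae_untreated)
corr_true = np.corrcoef(severity, ae_untreated)[0, 1]

print(corr_naive, corr_true)  # the naive association is far weaker than the counterfactual one
```

A model fit on the observed labels would conclude that high severity is barely predictive of an adverse event, precisely because the sickest patients were quietly treated.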
Professor Sebastian Nowozin introduced Generative Adversarial Networks starting from the simplest probabilistic model. Most of the latest GAN models fit into a taxonomy that follows a common framework, which is incredibly easy to follow. Maybe because it was the last topic of the summer school and drew on many fundamentals covered by previous speakers (e.g., probabilistic models, kernels, deep learning), the seemingly complicated formulas were (slightly) more understandable.
He also discussed some open research problems with respect to GANs. Specifically,
- Quantitative Evaluation Metrics
- GANs for discrete data
- Estimation Uncertainty
- Practical Bounds
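As a small illustration of the probabilistic view of GANs (my own toy example, not from the talk): for the original GAN objective, the optimal discriminator is D*(x) = p(x) / (p(x) + q(x)), and plugging it back into the objective yields 2·JSD(p, q) − log 4, tying the adversarial game to a classical divergence. This identity can be checked numerically for two 1-D Gaussians:

```python
import numpy as np

def gauss_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Dense grid for crude numerical integration
x = np.linspace(-10, 10, 100_001)
dx = x[1] - x[0]
p = gauss_pdf(x, 0.0, 1.0)   # "data" distribution
q = gauss_pdf(x, 1.0, 1.5)   # "generator" distribution

# Optimal discriminator for the original (non-saturating-free) GAN objective
d_star = p / (p + q)

# GAN value function evaluated at the optimal discriminator
value = np.sum((p * np.log(d_star) + q * np.log(1.0 - d_star)) * dx)

# Jensen-Shannon divergence between p and q
m = 0.5 * (p + q)
jsd = 0.5 * np.sum(p * np.log(p / m) * dx) + 0.5 * np.sum(q * np.log(q / m) * dx)

print(value, 2.0 * jsd - np.log(4.0))  # the two quantities agree
```

This is the sense in which many GAN variants fit one framework: swapping the divergence (or its variational lower bound) changes the model family, not the recipe.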
ML in Industry
Half a day of the summer school was reserved for industrial speakers to share their experiences in building ML applications and doing research in a business context. To my surprise, the talks were NOT the usual marketing talks I was expecting. BBVA and Microsoft are big organisations doing customer-facing ML applications and research. Prowler.io is a startup focused on commercialising Reinforcement Learning, while IIC is a local Spanish A.I. company specialized in real-time fraud detection solutions for banks.
BBVA is one of the major banks in Spain. Its Data & Analytics team (page) supports AI initiatives across the bank. Some use cases are real-time transaction classification, expense forecasting, a customer comparison engine (user modelling), recommender systems, etc.
The following lessons are particularly valuable for enterprise ML.
- The speaker, José Antonio Rodríguez Serrano, discussed the distinction between the ML research pipeline and ML products. BBVA aims to balance both ends by actively publishing at academic conferences (e.g., NIPS) while at the same time building customer-facing, deployable applications.
- The team consists not only of computer scientists and engineers, but also a dozen members from other disciplines. This good mix of expertise ensures they are constantly asking the right questions. On the other hand, the overall ML application landscape requires a similar mix of considerations. A good ML model is not everything: to make a successful data product, areas like fairness, user experience, and design require equal or more attention.
- The team collaborates with other lines of business (LOBs) for development and operations. The timeline for an ML project is usually shorter than for traditional software projects, and the team needs to move on to new ideas and use cases, sometimes working on multiple projects in parallel. Therefore, updates and continuous improvement are performed by the respective LOBs. It seems DevOps may not be the best fit for ML products.
Microsoft is one of the leaders in ML research. The speaker was Professor Sebastian Nowozin, Principal Research Scientist at Microsoft Research in Cambridge, UK. Besides his excellent talk on GANs (slides), he also talked about how Microsoft approaches AI, following its AI principles.
- Designing AI Technology
Nowozin also discussed how Microsoft Research accelerates the research-to-product cycle, using HoloLens as an example. VR research started almost 20 years ago at Microsoft, later found a synergy with the Xbox Kinect, and eventually led to the prototype of HoloLens in 2015.
Important lessons learned: building a successful ML product from a hard problem takes patience, investment, and some luck. Meanwhile, research always needs to be aligned with applications.
If there is anything I hope MLSS could improve, it would be to add some sort of hands-on session after each talk. For example, I am new to Gaussian Processes but curious to see them at work. Being able to run some toy examples would be very useful.
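In that spirit, here is the kind of toy example I had in mind: a minimal Gaussian Process regression on noisy sine data, using the standard closed-form posterior equations with an RBF kernel (all data and parameter choices here are mine, just for illustration):

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel k(a, b) = variance * exp(-(a - b)^2 / (2 l^2))."""
    sq = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * sq / lengthscale ** 2)

# Toy data: noisy observations of sin(x)
rng = np.random.default_rng(0)
x_train = rng.uniform(-3, 3, size=8)
y_train = np.sin(x_train) + 0.1 * rng.normal(size=8)
x_test = np.linspace(-4, 4, 200)

noise = 0.1 ** 2
K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
K_s = rbf_kernel(x_train, x_test)

# GP posterior mean and variance (standard textbook equations)
alpha = np.linalg.solve(K, y_train)
mean = K_s.T @ alpha
v = np.linalg.solve(K, K_s)
var = rbf_kernel(x_test, x_test).diagonal() - np.sum(K_s * v, axis=0)

print(mean[:3], var[:3])  # posterior mean and pointwise uncertainty on the test grid
```

Plotting `mean` with a band of `±2 * np.sqrt(var)` shows the characteristic GP behaviour: tight uncertainty near the training points and wide uncertainty where there is no data.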
There are two MLSS courses scheduled for 2019, held for the first time on the African continent (MLSS Stellenbosch, South Africa) and in London, respectively. I definitely recommend that anyone interested in learning more about Machine Learning apply.