12 great links on key Machine Learning topics in 2018

Louis Dorard
Own Machine Learning
Jan 10, 2019 · 6 min read
Just a nice picture of Stockholm, where I attended the 2018 edition of the International Conference on Machine Learning

As the year has just ended, I thought I’d share a selection of 12 great links on key, practical Machine Learning topics that I came across in 2018. They’re all excellent reads, but I’m curious to know which ones you enjoy most — let me know!

  1. ML is an area where you don’t want to reinvent the wheel, but to leverage other people’s work. APIs and ML services are essential for this. Unfortunately, it’s taking many companies a lot of time to understand this. In Why businesses fail at machine learning, Cassie Kozyrkov compares applied ML to baking (!) and makes the point that too many ML projects fail because the team doesn’t know whether they’re supposed to build the oven, the recipe, or the bread.
  2. There are many tools that lower the barrier to entry to ML (you don’t need to build your own oven!), but a popular survey run annually by Kaggle identified “lack of clear question to answer” as a major barrier faced by ML practitioners at work. Therefore, before you spend too much time comparing ML tools and APIs, I recommend this article published on the blog of ML API provider Nexosis: Formulating a Machine Learning Question.
  3. A fun exercise you can then do is to have a look at DataRobot’s list of ML use cases, organized by industry, and for each use case of interest, think about the ML question behind it: what type of problem it corresponds to (classification, regression, etc.), what the inputs and outputs are, and so on.
  4. In practice, you won’t need to know that much about how ML algorithms work to make a positive impact with ML. Tools like DataRobot and BigML automate the creation of ML models. What you do need to know is how to quantify the impact of ML models in your domain. In other words, you need a way to figure out which of two given models is best for you. Have a look at BigML’s tips to find an evaluation procedure and a performance metric that make sense for your application (the first sketch after this list gives a minimal example of such a comparison).
  5. After designing your data collection and evaluation procedures, you should check for target leakage. It is a common problem in ML that can completely mess up your results by making them wildly optimistic: your models will look as if they perform great, until you deploy them to production… Jake Shaver at DataRobot made a great 5-minute video to introduce that topic: AI Simplified: Target Leakage. He estimates that target leakage can cost companies millions of dollars in the long run. Here’s my version of his loan application example: imagine you collected a bunch of loan applications in 2018, and you now realize that it would be useful to also use loan seekers’ FICO scores in order to best characterize them. If you didn’t get each person’s score at the time of their application, it would be too late to get it now: you would be using information from 2019, when the prediction was meant to be made at the time of application (2018). Now consider a churn prediction problem, where you would observe customers canceling their subscriptions in 2019. Those cancellations should be linked to these customers’ snapshots from 2018, otherwise you would also be leaking information from the future (the second sketch after this list illustrates this).
  6. ML automation (aka AutoML) leads to models that perform very well, but that are essentially “black boxes”. While they are fine to use in certain application domains, some people are uncomfortable with black-box models. It’s interesting to see that, as AutoML techniques have progressed, so have techniques that aim at explaining the behavior of black-box models. I think they can be very useful to “debug” models: by making sure that their behavior makes sense, you can gain trust in the fact that the models will work well on production data. (One simple example of a model “bug” is making predictions for a user based on the value of their ID, which should have zero impact; the third sketch after this list shows one way to run that check.) Cassie Kozyrkov wrote another blog post that caught my eye this year: Explainable AI won’t deliver. I highly recommend it if you’re a fan of the idea of explaining the behavior of black boxes. Explainability should not prevent you from creating strong test procedures for your ML system (which I’ve seen happen). Also, if your initial feeling is that you can’t use black-box models, I think it’s important to ask and understand why, and to make specific arguments.
  7. Some commercial tools are doing a good job of providing ways to explain models and predictions. In the open source world, the What-If Tool has novel features to inspect model behavior visually (no coding required), such as Counterfactuals and Performance + Fairness. David Weinberger (one of our speakers at PAPIs 2016) provides an introduction in Playing with AI Fairness, with an example of a loan classification system. The What-If Tool only works with TensorFlow models for now, but I hope it will be extended to support other libraries such as scikit-learn, and that it will serve as inspiration for makers of commercial ML tools…
  8. Creating models that you trust for production usage is only part of the story: you’ll also need to deploy them. There are a number of options here, as explained by Julien Simon in Mastering the mystical art of model deployment (the fourth sketch after this list shows one of the simplest).
  9. I’ve only mentioned proprietary ML platforms so far, but if you prefer to go the open source route, have a look at the Marvin platform. Marvin recently joined the Apache Incubator program, and its creators presented it at PAPIs. They also wrote a technical paper on the design of Marvin, which we published in Proceedings of Machine Learning Research: Marvin — Open source artificial intelligence platform.
  10. There’s tremendous potential in using ML tools and APIs to improve all areas of our lives, but there’s also a high potential for things to go very wrong at the “system level”. In Artificial Intelligence — The Revolution Hasn’t Happened Yet, Michael Jordan mentions issues with a medical system that measured variables and outcomes in various places and times, conducted statistical analyses, and made use of the results in other places and times; the impact of that, in the example he gives, was the needless death of fetuses… Michael writes: “Just as early buildings and bridges sometimes fell to the ground — in unforeseen ways and with tragic consequences — many of our early societal-scale inference-and-decision-making systems are already exposing serious conceptual flaws.”
  11. Another source of problems for society is bias in ML-powered systems. Rachel Thomas gave a great talk on this at QCon: Analyzing & Preventing Unconscious Bias in Machine Learning. Among other examples, she mentions predictive policing software that had a significantly higher rate of false positives for black people than for white people. This goes back to model behavior and fairness analysis (the last sketch after this list shows the basic per-group check).
  12. Some ML systems can be more challenging than others to design and build. What should you start with? How can you transform your organization with ML and mobilize the right resources? Andrew Ng sheds some light on this in the AI Transformation Playbook that he published last month. He recommends the following 5 steps: execute pilot projects, build team, provide training, develop strategy, develop communications.
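
To make a few of the items above more concrete, here are some quick Python sketches. First, for item 4: a minimal sketch of comparing two candidate models on a metric that reflects domain-specific costs rather than plain accuracy. The dataset, the models, and the cost values are all placeholder assumptions; in practice they should come from your own application.

```python
# A minimal sketch of comparing two candidate models on a
# domain-relevant metric instead of default accuracy.
# The cost values below are hypothetical placeholders.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=2000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

def business_cost(y_true, y_pred, fp_cost=1.0, fn_cost=5.0):
    """Total cost where a missed positive (false negative)
    hurts five times more than a false alarm (assumed ratio)."""
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    return fp * fp_cost + fn * fn_cost

# The model with the lowest cost is "best for you", even if
# another model has higher accuracy.
for model in (LogisticRegression(max_iter=1000),
              RandomForestClassifier(random_state=0)):
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    print(type(model).__name__, business_cost(y_test, y_pred))
```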
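
Next, the churn variant of the leakage example in item 5. A minimal sketch (with made-up customer data and column names) of the point-in-time discipline that avoids leaking the future: features are snapshotted as of 2018, and labels are the outcomes observed in 2019.

```python
# A minimal sketch of the churn example from item 5: train on
# customer snapshots taken *before* the prediction time, and
# attach labels observed afterwards. All data is made up.
import pandas as pd

# Feature snapshots as of end of 2018 (what we knew at prediction time)
snapshots_2018 = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "monthly_usage": [120, 5, 60],    # measured during 2018
    "support_tickets": [0, 4, 1],     # counted up to end of 2018
})

# Outcomes observed during 2019
churn_2019 = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "churned": [0, 1, 0],
})

# Correct: 2018 features joined with 2019 labels.
training_set = snapshots_2018.merge(churn_2019, on="customer_id")
print(training_set)

# Leaky: recomputing features from *current* data (which already
# reflects 2019 behavior, e.g. usage dropping to zero after a
# cancellation) would let information from the future seep in.
```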
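
For the model “debugging” check in item 6, here is one possible way (certainly not the only one) to verify that a user ID has close to zero impact on predictions: compute its permutation importance with scikit-learn. The dataset is synthetic and the ID column is an assumption for illustration.

```python
# A minimal sketch of the "model bug" check from item 6: a user ID
# should carry no signal, so its permutation importance should be
# near zero. If it isn't, something is wrong (e.g. IDs happen to
# encode signup order, which correlates with the target).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=5, random_state=0)
user_id = np.arange(len(y)).reshape(-1, 1)   # an arbitrary ID column
X = np.hstack([X, user_id])

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

result = permutation_importance(model, X_test, y_test,
                                n_repeats=10, random_state=0)
print("ID column importance:", result.importances_mean[-1])  # expect ~0
```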
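
As a taste of item 8, a minimal sketch of one of the simplest deployment options: exposing a previously trained, pickled model behind an HTTP endpoint with Flask. The model file name and input format are hypothetical; Julien Simon’s article covers more production-grade options.

```python
# A minimal sketch of one deployment option: serving a trained
# model over HTTP with Flask. "model.pkl" is a hypothetical file
# containing a previously trained scikit-learn model.
import pickle
from flask import Flask, request, jsonify

app = Flask(__name__)
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    # Expects JSON like {"features": [[1.2, 3.4, ...], ...]}
    features = request.get_json()["features"]
    predictions = model.predict(features).tolist()
    return jsonify({"predictions": predictions})

if __name__ == "__main__":
    app.run(port=5000)
```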
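
Finally, the kind of disparity Rachel Thomas describes in item 11 can be surfaced with a very simple per-group check. A minimal sketch on made-up data: compute the false positive rate within each group and compare.

```python
# A minimal sketch of the fairness check behind item 11: compare
# false positive rates across groups. The data is made up purely
# for illustration.
import pandas as pd

df = pd.DataFrame({
    "group":     ["A", "A", "A", "B", "B", "B"],
    "actual":    [0,   0,   1,   0,   0,   1],
    "predicted": [1,   0,   1,   0,   0,   1],
})

def false_positive_rate(g):
    negatives = g[g["actual"] == 0]
    return (negatives["predicted"] == 1).mean()

print(df.groupby("group")[["actual", "predicted"]]
        .apply(false_positive_rate))
```

If the rates differ substantially between groups, that’s a signal to dig into the data and the model’s behavior before deploying.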

Which of Andrew Ng’s 5 steps would you say you’re at? Are there any other links from last year that you would have liked to see in this list?

Let me know if you have any questions or thoughts to share after reading the above. I’d love to chat more with you in 2019, if you can join us at our PAPIs conference on 24–26 June 2019 in São Paulo.

Happy 2019!

Louis
