Today I am proud to be launching a new MOOC: Machine Learning for All, together with this blog to support the course. I have been working on the course for almost a year and, in a certain sense, even longer because it builds on over a decade of work on how to democratise machine learning.
The course is based on two fundamental beliefs of mine: that (almost) anyone can learn to do machine learning and that everyone should.
The first belief might seem strange given that machine learning is often seen as one of the most advanced sub-fields of computer science, filled with the brainiest, most mathematically adept researchers.
It is true that developing new machine learning algorithms involves a lot of maths and specialist knowledge. But most of the time doing machine learning is not about new algorithms, it is about using existing algorithms and code libraries to train machine learning models.
Machine learning systems learn to do things from data, a process called training. This is what makes it different from conventional computer science. It is not about programming algorithms, it is about training with data. This makes data the most important thing in machine learning. As Peter Norvig, Director of Research at Google, said: “We don’t have better algorithms. We just have more data”. It is an overgeneralisation to say that algorithms are irrelevant, but there are now a lot of very good learning algorithms out there that work for a variety of problems and I would predict that a lot of the expansion of machine learning in the next few years will be applying it to new types of data.
But what is this data? It is images, written text, speech. It is photographs, radiology scans, news articles, pop songs and conversations.
You don’t have to be a computer scientist to understand these things. The real experts on radiology scans are radiologists, not computer scientists. The real experts in news articles are journalists, not computer scientists. The real experts in conversations or human faces are, frankly, pretty much everyone.
So data is the most important thing in machine learning and you don’t need to be a computer scientist to understand data. You shouldn’t have to be a computer scientist to do machine learning.
Usable machine learning
So why is machine learning still the preserve of expert computer scientists?
I don’t think there is anything strange here. It is actually just the standard trajectory for computing technology. When a new software technology is developed, whether it is computer graphics, text processing or search, it starts out in research labs, developed by experts. You need good programming skills and probably maths to use it.
As a technology becomes established, it is made more usable. The development of fundamental algorithms gives way to Human-Computer Interaction (HCI), the study of how humans interact with computer systems and how to make those computer systems easier to use. The most obvious manifestation is the move from programming APIs to graphical user interfaces, but HCI goes much deeper into the design of the entire user experience.
Why hasn’t this happened for machine learning (yet)? I’m not entirely sure. It could be that machine learning is a particularly novel and difficult area of computer science and presents unique challenges in making it usable. It could be that the technology has only recently become practically useful enough for it to be worth thinking about HCI. Or it could be that the framing of machine learning as Artificial Intelligence is incompatible with the HCI framing of technology as a tool.
I’m not sure any of this is important. The key thing is that it just hasn’t happened yet. There is no fundamental reason HCI can’t be applied to machine learning, and it will happen; it’s only a matter of time. In fact, there is a group of researchers (myself included) working on HCI and machine learning, what I have called Human-Centered Machine Learning. We may be outnumbered by the researchers developing fancy new deep learning algorithms, but I believe that usability will be the most important development in machine learning in the next decade.
Usability is the basis of the course we have created. We didn’t just want people to learn about machine learning in theory, but to actually try it, and for that we needed to build a machine learning platform that anyone can use. My colleagues at Goldsmiths helped create a new online tool for learning how to do machine learning. It allows people to create simple classifiers for image recognition and uses a simple user interface inspired by my colleague Rebecca Fiebrink’s Wekinator and the Teachable Machine. This tool made it possible to teach machine learning to non-programmers in practice, and to have students do practical projects.
The Importance of Education
But we haven’t just built a tool, we have built a course. That is because education is still very important. Machine Learning is a very different way of thinking about computers, and, possibly more importantly, a very different skill.
The core skill of traditional computer science is programming. This involves a number of different elements, such as decomposing a problem into its parts, designing an algorithm, implementing that algorithm in code, testing and debugging. Having taught programming myself for many years I know that it is a difficult skill to acquire and that it requires a long time and lots of practice, but it does benefit from many years of experience of people learning and teaching programming. That means that there are well established pedagogies, including exercises graded by difficulty, example code to learn from and extend and also more abstract ways of teaching, like the explanation of algorithms.
Working as a machine learning engineer does involve programming (though, as I’ve just explained, I don’t think it always needs to), but there are several new skills that are specific to machine learning. It requires collecting and labelling data sets, selecting appropriate machine learning algorithms and tweaking their meta-parameters.
Like programming, it requires a lot of testing and debugging, but these are different skills. Machine Learning is probabilistic: it aims to give correct results in the majority of cases, but isn’t guaranteed to be correct all the time. That means that testing doesn’t mean getting the answer correct in all of a fixed list of tests, it means having a high accuracy percentage on a large testing data set. Debugging is even more different from programming, as it often doesn’t mean diving into the details of the code to find an error. It would normally mean finding which items in the dataset are misleading, or changing the parameters of the algorithm.
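To make this difference concrete, here is a minimal sketch in Python. It is only an illustration, not the tool used in the course: the tiny two-number “images”, the labels and the nearest-centroid classifier are all assumptions I have made up to keep the example short. It shows that “testing” produces an accuracy figure rather than a pass or fail, and that “debugging” starts by looking at the items the model got wrong.

```python
# Illustrative sketch only: toy data and a toy nearest-centroid classifier,
# both invented for this example (not the course platform).

def train_centroids(examples):
    """Training: average the feature vectors seen for each label."""
    sums, counts = {}, {}
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}

def predict(centroids, features):
    """Predict the label whose centroid is closest to the features."""
    def squared_distance(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))

def evaluate(centroids, test_set):
    """Testing: return accuracy on held-out data plus the misclassified items."""
    errors = []
    for features, label in test_set:
        guess = predict(centroids, features)
        if guess != label:
            errors.append((features, label, guess))
    correct = len(test_set) - len(errors)
    return correct / len(test_set), errors

# Hypothetical labelled data: each item is (features, label).
training_set = [
    ((1.0, 1.0), "cat"), ((1.0, 2.0), "cat"), ((2.0, 1.0), "cat"),
    ((5.0, 5.0), "dog"), ((6.0, 5.0), "dog"), ((5.0, 6.0), "dog"),
]
test_set = [
    ((1.5, 1.0), "cat"), ((5.5, 5.0), "dog"), ((2.0, 2.0), "cat"),
    ((3.0, 3.0), "dog"), ((6.0, 6.0), "dog"),
]

centroids = train_centroids(training_set)
accuracy, errors = evaluate(centroids, test_set)

print(f"Accuracy: {accuracy:.0%}")
# "Debugging" begins with the data: which items did the model get wrong?
for features, label, guess in errors:
    print(f"Item {features} is labelled {label!r} but was predicted as {guess!r}")
```

In this toy case the model is not “broken” in the programming sense. It simply gets one unusual item wrong, and the next step would be to ask whether that item is mislabelled, whether the training data needs more examples like it, or whether the algorithm’s parameters need changing.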
These skills are very different from traditional programming skills, and are not well understood, even by experts. Many world-leading machine learning researchers will still say that there is a lot of trial and error and ‘gut’ feeling in how to do machine learning effectively. If we don’t understand how to do it, we are even further behind on how to teach these skills.
A lot of machine learning courses focus on the traditional computer science elements: understanding what the algorithms do and how to implement them in a language like Python. That is important if you are implementing new machine learning systems, but it doesn’t teach how to do machine learning. Understanding the algorithm, of course, helps, but most of the activities of a machine learning engineer focus on the data itself, so we need pedagogies that focus on the data skills.
This is what we’ve tried to do with the “Machine Learning for All” MOOC, and what many other educators across the world are working on. Only by working out how best to teach people the new skills of Machine Learning will this technology reach its full potential.
All my teaching comes from the belief that people should be able to learn about the latest technologies. This is particularly true of my MOOC teaching, which brings this learning beyond the university classroom, but I think there is a particular ethical imperative driving the Machine Learning for All course.
Machine Learning is a technology that is little understood, and yet is likely to have a massive impact on ordinary people. The more that machine learning is an incomprehensible black box to the people affected by it, the less they can have any influence over the technology and how it affects them. Cathy O’Neil in her excellent book Weapons of Math Destruction has detailed many of the ways that machine learning black boxes can have very negative effects, particularly if we don’t question how they work.
The types of people who create machine learning systems are also very limited, because access to machine learning education is limited to the most mathematically highly educated. This means that these systems are created by people with expertise in maths and computer science, not necessarily in the data they are using, as I’ve mentioned above.
Even when you are dealing with data we are all expert in, like faces, the bias is still a problem, because machine learning engineers are typically a certain type of person: highly educated, middle class, mostly young, overwhelmingly male, and mostly in coastal USA, Western Europe and big city China. They will bring their unconscious biases to how they build their machine learning systems.
This is often about people collecting data near them, where it is more convenient. If you are creating a face recognition system in Europe, you are likely to collect images of faces in Europe, which means your system is likely to work well on European faces but less well on Asian or African faces. It has been well established that this is the case with many face recognition systems (though the rise of machine learning in China is massively improving recognition of East Asian faces, a direct illustration that changing who creates a system can change how it works).
Biases can also be more conceptual. A young, male, middle class machine learning engineer in a big, wealthy city, might well not think about features or problems that would be important to older people, women, working class people, and people living in rural environments or poor countries.
I’m not saying that the engineers would be consciously sexist, racist or ageist. It’s just that people’s life experiences shape their way of thinking, so limiting the creation of machine learning to certain types of people will limit the ideas that go into the design of systems. That can make them less suitable for other types of people, but also limits the overall creativity of the field.
There has been a lot written, and quite rightly, about the dangers of machine learning. I’m not talking about the risks of some future, super intelligent AI, which I find quite fanciful, but the very real problems caused by the use of Machine Learning right now or in the very near future. Machine Learning is a technology that could bring massive benefits to humanity, but the problems need to be addressed. I believe that, in part, these problems are due to the general public having very little say in the technology because they don’t know how it works. One of the best ways of addressing these problems is to make sure that as many people as possible understand enough about machine learning to have a say in how it is designed and used, and that many more people have the skills to design machine learning systems. That is why we created the “Machine Learning for All” MOOC.