An introduction to Machine Learning

For Curious Programmers

IO Digital · 4 min read · Jun 27, 2017

Machine learning is transforming the world around us and enabling technology that previously only existed in sci-fi movies.

By Alex Conway

Instead of needing to hand-code rules to solve challenging problems such as computer vision, speech recognition, and self-driving cars, programmers can build systems that enable computers to “learn” how to solve these problems on their own. Put simply, machine learning is a type of artificial intelligence that enables computers to learn from data. Give an algorithm enough examples of solutions to a problem and it can learn to predict solutions for new, unseen examples.

This type of problem, where we have known solutions for our examples (so-called “training data”), is called a ‘supervised learning’ problem (in contrast to ‘unsupervised learning’ problems such as clustering). A simple example is predicting the price of a house given input “features” (just columns in a spreadsheet) such as the number of bedrooms, bathrooms, and garages, the floor area, the postal code, etc. Many algorithms can be used to solve supervised learning problems, from linear regression to random forests, support vector machines, and neural networks, but fundamentally all we are doing is trying to create a function that “learns” to map some input data to an output solution.
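To make this concrete, here is a minimal sketch of the house-price idea with a single feature and a handful of invented numbers (the data, and the choice of floor area as the only feature, are purely illustrative):

```python
# Fit price = w * area + b to made-up training examples using
# ordinary least squares, then predict the price of an unseen house.
# All numbers below are invented for illustration only.

areas = [50.0, 80.0, 100.0, 120.0, 150.0]     # floor area (square metres)
prices = [150.0, 240.0, 290.0, 360.0, 440.0]  # price (thousands)

n = len(areas)
mean_x = sum(areas) / n
mean_y = sum(prices) / n

# Closed-form least-squares solution for a single input feature.
w = sum((x - mean_x) * (y - mean_y) for x, y in zip(areas, prices)) / \
    sum((x - mean_x) ** 2 for x in areas)
b = mean_y - w * mean_x

# The learned function maps a new, unseen input to a predicted output.
print(w * 90 + b)  # predicted price of a 90 square-metre house
```

The point is not the formula but the shape of the workflow: examples in, a learned mapping out, predictions for inputs the model has never seen.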

The way this is done is to define a “loss function” (a.k.a. objective function) and then use some algorithm to minimize it. Typically we divide our data into ‘training’ and ‘validation’ sets (often an 80/20 split), then “train” our model using the training set and evaluate its performance against the validation set. Training the model really just means tweaking its parameters incrementally (usually using gradient descent) so that the loss function decreases until it has been ‘minimized’. In the house price prediction problem described above, we want our predictions to be as accurate as possible, so a common choice of loss function is the average squared error between our predictions and the known labels (remember, we use our model to predict labels on the validation set, for which we know the true labels).
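The whole loop above — split the data, minimize a loss with gradient descent, evaluate on held-out examples — can be sketched in a few lines of plain Python (again with invented house-price data; the learning rate and iteration count are arbitrary choices for this toy problem):

```python
# Toy training loop: 80/20 train/validation split, then minimise the
# mean squared error of price = w * area + b by gradient descent.
# All data below is invented for illustration only.

data = [(50.0, 150.0), (80.0, 240.0), (100.0, 290.0), (120.0, 360.0),
        (150.0, 440.0), (60.0, 180.0), (90.0, 270.0), (110.0, 330.0),
        (130.0, 390.0), (140.0, 410.0)]

train, val = data[:8], data[2*4:]  # an 80/20 split
train, val = data[:8], data[8:]

def mse(w, b, examples):
    """Mean squared error: the loss function we are minimising."""
    return sum((w * x + b - y) ** 2 for x, y in examples) / len(examples)

w, b = 0.0, 0.0
lr = 5e-5  # learning rate: the size of each incremental tweak
for _ in range(2000):
    # Gradients of the MSE with respect to the parameters w and b.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in train) / len(train)
    grad_b = sum(2 * (w * x + b - y) for x, y in train) / len(train)
    w -= lr * grad_w  # step downhill on the loss surface
    b -= lr * grad_b

print(mse(w, b, val))  # performance on data the model never saw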
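The whole loop above — split the data, minimize a loss with gradient descent, evaluate on held-out examples — can be sketched in a few lines of plain Python (again with invented house-price data; the learning rate and iteration count are arbitrary choices for this toy problem):

```python
# Toy training loop: 80/20 train/validation split, then minimise the
# mean squared error of price = w * area + b by gradient descent.
# All data below is invented for illustration only.

data = [(50.0, 150.0), (80.0, 240.0), (100.0, 290.0), (120.0, 360.0),
        (150.0, 440.0), (60.0, 180.0), (90.0, 270.0), (110.0, 330.0),
        (130.0, 390.0), (140.0, 410.0)]

train, val = data[:8], data[8:]  # an 80/20 split

def mse(w, b, examples):
    """Mean squared error: the loss function we are minimising."""
    return sum((w * x + b - y) ** 2 for x, y in examples) / len(examples)

w, b = 0.0, 0.0
lr = 5e-5  # learning rate: the size of each incremental tweak
for _ in range(2000):
    # Gradients of the MSE with respect to the parameters w and b.
    grad_w = sum(2 * (w * x + b - y) * x for x, y in train) / len(train)
    grad_b = sum(2 * (w * x + b - y) for x, y in train) / len(train)
    w -= lr * grad_w  # step downhill on the loss surface
    b -= lr * grad_b

print(mse(w, b, val))  # performance on data the model never saw
```

Each pass nudges w and b in the direction that reduces the training loss; the validation MSE then tells us how well the learned function generalizes.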

Coming up with loss functions to map an image input to a label output (for image classification), or an input sound wave to a text output (for speech to text), is trickier, but it is fundamentally the same idea: build a black box into which we feed examples, then use the black box to predict labels for new examples.

It’s possible to create such a function for almost any kind of input:output pair. For example, we can use machine learning to create functions that map:

→ a sentence in one language → a sentence in another language (Google Translate)
→ a photograph → a list of objects / people in the photo (Google Photos)
→ an audio signal → text (speech to text) (Siri / Alexa / Cortana)
→ a document → a sentiment score
→ past loan outcomes + data on an individual → a credit score
→ past e-commerce purchase + visit data → product recommendations

Part of why machine learning has become so popular is that there are many open source libraries implementing black-box algorithms that can be used, with just a few lines of code, to get state-of-the-art results without needing a math PhD. It is important to understand how to pose a problem and transform input data (‘feature engineering’), but there are many free resources available online that motivated programmers can use to build machine learning into their projects.
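As a sense of just how few lines that can be, here is a sketch using scikit-learn (assuming it is installed): train a random forest classifier on the library’s built-in iris flower dataset and score it on a held-out validation set.

```python
# A few lines of scikit-learn: fit a black-box classifier and
# measure its accuracy on a 20% validation split.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)          # "train" on the training set

print(model.score(X_val, y_val))     # accuracy on unseen examples
```

Swapping in your own spreadsheet of features and labels in place of the iris data is exactly the “plug your own data into the examples” step suggested below.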

The popular Coursera Machine Learning course lectured by Andrew Ng is a good place to start, as is plugging your own data into examples from the Python scikit-learn library’s documentation. You could also just search “machine learning API” and use an external API service until you realize how easy it is to get started, then take your first leap into building some mind-bending, sci-fi solutions of your own.

About the author

Alex is the founder and CTO of NumberBoost, a startup that builds interactive machine learning tools and does data science consulting. This article is an adaptation of his talk on the basics of machine learning at the latest #IOPowwow. He is on Twitter as @alxcnwy.

IO Powwow meetups

IO hosts the popular #iopowwow tech meetup every last Friday of the month in Cape Town. The event covers tech topics that inspire curiosity and a yearning for learning, such as how to conduct an effective #NanoSprint (a methodology pioneered by IO) based on the Google Design Sprint methodology, how to get into #MachineLearning, and how to make games using JavaScript. For more information about the IO Powwow meetups, visit us on Medium.
