Machine Learning is being used to solve many problems. Which problems can you use it for?
Why is Machine Learning important?
In the last 5 years there has been growing success using machine learning. Rapidly increasing processor speed and access to large-scale data sets are allowing many new problems to be solved with machine learning. Today machine learning is being applied by innovative companies in almost every field.
Using machine learning to solve problems is becoming central to many companies' core points of differentiation. At every step in the development of machine learning, there will be a huge economic payoff for the companies involved.
“Machine Learning will be the basis and fundamentals of every successful huge IPO win in 5 years.” - Eric Schmidt, Google Executive Chairman
What is Machine Learning?
Machine Learning is the development of computer programs that use the information in datasets to decide outputs.
Traditional computer programs explicitly define the steps to transform an input to an output. For example, when a sale is made for $2 and paid for with $5, give back $3 change. This is based on mathematical rules which can be executed without knowing anything about other transactions that have occurred.
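The change example can be written as a short rule-based function. This is only an illustrative sketch: the function name `change_due` and the use of `Decimal` for currency are assumptions, not part of any particular system. The result depends only on the two inputs, never on earlier transactions.

```python
from decimal import Decimal


def change_due(price: Decimal, paid: Decimal) -> Decimal:
    """Return the change owed, using a fixed rule and no other data."""
    if paid < price:
        raise ValueError("payment does not cover the price")
    return paid - price


# A $2 sale paid for with $5 gives back $3.
print(change_due(Decimal("2.00"), Decimal("5.00")))  # 3.00
```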
Machine learning is useful for problems where we want to decide the output based on data. The model that determines the output finds patterns in the variations in the data. As data passes through the model during training, ‘learning’ occurs: the model is adjusted so that its outputs move closer to the goal.
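To make ‘learning’ concrete, here is a minimal sketch of one possible training loop. It assumes a one-parameter model (`output = weight * x`), made-up data, and plain gradient descent; real models have many more parameters, but the idea of repeatedly nudging the model so its outputs move closer to the goal is the same.

```python
# Hypothetical data: inputs x paired with the outputs y we want (here y = 3x).
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0), (4.0, 12.0)]


def mean_squared_error(weight, samples):
    """Average squared gap between the model's outputs and the goal."""
    return sum((weight * x - y) ** 2 for x, y in samples) / len(samples)


def train(samples, learning_rate=0.01, epochs=200):
    """Adjust the single weight a little on every pass over the data."""
    weight = 0.0  # the model starts out knowing nothing
    history = []
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to the weight.
        grad = sum(2 * (weight * x - y) * x for x, y in samples) / len(samples)
        weight -= learning_rate * grad
        history.append(mean_squared_error(weight, samples))
    return weight, history


weight, history = train(data)
print(f"learned weight: {weight:.3f}")
print(f"error after first pass: {history[0]:.3f}, after last pass: {history[-1]:.6f}")
```

The error recorded in `history` shrinks with each pass, which is the ‘outputs closer to the goal’ behavior described above.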
Where Can Machine Learning Be Applied?
The problems where we want to use data to make decisions (i.e. machine learning problems) are those where the rules for deciding the output are not clearly defined. These problems typically involve human interaction or natural systems. Such interactions are good candidates because the rules that determine the outcome of a natural interaction are highly complex and always partly unknown.
When thinking specifically about where machine learning can be applied, it is often valuable to identify the natural impactor in the problem. For example, in advertising we want to serve the ad that is most likely to get clicked. However, a human makes the decision to click on the ad or not, and that human is the natural impactor. Natural impactors do not follow a clear set of rules, which makes their behavior a good candidate to learn from data. In this case, we want to learn how people have interacted with previous ads to determine which ad to show next.
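As a rough illustration of learning from previous ads, the sketch below counts clicks in a made-up impression log and picks the ad with the highest observed click rate. The log, the ad names, and the helper functions are all hypothetical. A real ad system would also take the user and context into account, but the decision is still driven by data rather than by hand-written rules.

```python
from collections import defaultdict

# Hypothetical log of past impressions: (ad_id, was_clicked).
impressions = [
    ("ad_a", True), ("ad_a", False), ("ad_a", False), ("ad_a", False),
    ("ad_b", True), ("ad_b", True), ("ad_b", False), ("ad_b", False),
    ("ad_c", False), ("ad_c", False),
]


def click_rates(log):
    """Estimate each ad's click rate from how people acted in the past."""
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for ad_id, was_clicked in log:
        shown[ad_id] += 1
        clicked[ad_id] += int(was_clicked)
    return {ad_id: clicked[ad_id] / shown[ad_id] for ad_id in shown}


def best_ad(log):
    """Pick the ad with the highest observed click rate."""
    rates = click_rates(log)
    return max(rates, key=rates.get)


print(click_rates(impressions))  # {'ad_a': 0.25, 'ad_b': 0.5, 'ad_c': 0.0}
print(best_ad(impressions))      # ad_b
```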
This thinking pattern can be generalized as the following:
- What output do we want to reach?
- Does calculating this output depend on human perception or a natural system?
A few examples using this pattern:
- We want to determine which emails are spam. Yes, this is based on human perception, because an email is only spam if the user would consider it spam.
- We want to determine the amount of change to give back. No, this is a math equation. It is based on formal math rules and only needs the cost and the payment amount to calculate the change returned.
- What animal is in the picture? Yes, the animal exists in a natural system. Unless the same pictures are reused, every new picture taken of an animal in a natural system will differ in some way.
- Which driving maneuver will provide safety and follow laws? Mostly yes. While traffic laws may be a well-defined set of rules, the best actions for providing safety in a natural system are not defined.
How can Machine Learning be applied?
Once we have framed our problem, understanding the goal and the decision we are trying to make using data, we can begin to explore the data/feedback that will be provided to the system. Machine learning problems are classified into three buckets depending on the nature of the learning “feedback” available to the system. These are:
- Supervised learning: Predicting a future value based on existing labeled data, known as a training set.
- Unsupervised learning: Given a data set with no right or wrong answers, the system looks for patterns in the data.
- Reinforcement learning: A computer program interacts with a dynamic environment in which it must achieve a certain goal (such as driving a vehicle), without a teacher explicitly telling it whether it has come close to its goal. Another example is learning to play a game by playing against an opponent.
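The difference between the first two buckets can be sketched with toy one-dimensional data. The example assumes a nearest-neighbor rule for the supervised case and a simple two-group clustering loop for the unsupervised case. Both are illustrative choices, and the data and function names are made up. Reinforcement learning is not shown because it needs an environment to interact with.

```python
def nearest_neighbor_predict(training_set, x):
    """Supervised: reuse the label of the closest labeled training point."""
    _, closest_label = min(training_set, key=lambda pair: abs(pair[0] - x))
    return closest_label


def two_means(values, iterations=10):
    """Unsupervised: split unlabeled values into two groups by closeness.

    Assumes at least two distinct values and iterations >= 1.
    """
    low, high = min(values), max(values)
    if low == high:
        raise ValueError("need at least two distinct values")
    for _ in range(iterations):
        # Assign every value to whichever group center is closer.
        low_group = [v for v in values if abs(v - low) <= abs(v - high)]
        high_group = [v for v in values if abs(v - low) > abs(v - high)]
        # Move each center to the mean of its group.
        low = sum(low_group) / len(low_group)
        high = sum(high_group) / len(high_group)
    return low_group, high_group


# Supervised: every training point comes with the correct answer.
training_set = [(1.0, "small"), (2.0, "small"), (9.0, "large"), (10.0, "large")]
print(nearest_neighbor_predict(training_set, 3.0))  # small
print(nearest_neighbor_predict(training_set, 8.0))  # large

# Unsupervised: the same numbers without labels; only structure is found.
print(two_means([1.0, 2.0, 9.0, 10.0]))  # ([1.0, 2.0], [9.0, 10.0])
```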
Often the most difficult part of effectively using machine learning is creating a suitable data set. A good data set is standardized and large. Supervised problems require a training data set, which is a collection of data points that each have a label defining the correct answer. This training data is used to ‘train’ the system so that it can handle data where the answer is not known. Examples of training data sets are the public MNIST and ImageNet data sets.
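A labeled training data set can be pictured as a list of (data point, label) pairs. The sketch below uses made-up email subjects with spam labels and a hypothetical `split_dataset` helper that holds some examples back. Holding out part of the labeled data is a common practice, assumed here, for checking how the trained system behaves on data it has not seen.

```python
import random

# Hypothetical labeled examples: each data point carries its correct answer.
labeled_examples = [
    ("win a free prize now", "spam"),
    ("meeting moved to 3pm", "not spam"),
    ("cheap loans approved instantly", "spam"),
    ("lunch tomorrow?", "not spam"),
    ("claim your reward today", "spam"),
    ("quarterly report attached", "not spam"),
    ("you have been selected", "spam"),
    ("notes from today's call", "not spam"),
]


def split_dataset(examples, test_fraction=0.25, seed=0):
    """Shuffle the examples and hold back a fraction for later checking."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    test_size = int(len(shuffled) * test_fraction)
    return shuffled[test_size:], shuffled[:test_size]


train_set, test_set = split_dataset(labeled_examples)
print(len(train_set), "training examples,", len(test_set), "held-out examples")
```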
Once we have our standardized data we can begin to train our model. Many machine learning libraries exist; the most popular lately is TensorFlow, which has been open sourced by Google. TensorFlow offers examples that span many common machine learning problem types. Many of these models can be reused to solve different but structurally similar problems. These libraries allow you to quickly experiment and apply machine learning with near state-of-the-art algorithms.