Search for almost any technology topic online and you will soon run into “machine learning” and “deep learning” and claims about how they are changing the way we live. Machine learning is now used almost everywhere: self-driving cars, email spam detection, the recommender systems behind Netflix and Amazon, credit card fraud detection at banks, and more. The list keeps growing as new applications are created. It is therefore worth understanding what machine learning actually is and getting a broad view of its main types. In this article, I explain machine learning and its different categories. We will also discuss some of its basic limitations.
What is machine learning?
Machine learning is the process of teaching computers to learn from data and make decisions without being explicitly programmed with the rules for doing so. Generally, we first have to train a machine learning model before it can make decisions on its own. Data is therefore one of the most important ingredients: without data, a machine learning algorithm has nothing to learn from. A model learns patterns from the data we feed it and then uses those patterns to make predictions on new data. Because the model depends entirely on that data when making future predictions, the data we provide must reflect the real world.
Under the hood, machine learning involves a good deal of mathematics, such as dot products and matrix multiplication, feature scaling and normalization. The data we feed to machine learning algorithms therefore has to be in the form of numbers or vectors rather than raw text or other forms that the algorithms cannot operate on directly. Text and categorical labels are typical examples. We have to convert them into numbers before feeding them to the models for training and prediction.
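One simple way to turn text into numbers is a bag-of-words count vector. The sketch below is only an illustration, not a method prescribed by this article: the function names `build_vocabulary` and `to_count_vector` and the sample documents are invented for the example, and real projects usually rely on a library for this step.

```python
def build_vocabulary(documents):
    """Map each distinct lowercase word to a column index, in sorted order."""
    words = sorted({word for doc in documents for word in doc.lower().split()})
    return {word: index for index, word in enumerate(words)}


def to_count_vector(document, vocabulary):
    """Turn one document into a list of word counts (a bag-of-words vector)."""
    vector = [0] * len(vocabulary)
    for word in document.lower().split():
        if word in vocabulary:  # words missing from the vocabulary are skipped
            vector[vocabulary[word]] += 1
    return vector


# Hypothetical example documents, invented for illustration.
documents = ["free prize inside", "meeting at noon", "free free meeting"]
vocabulary = build_vocabulary(documents)
vectors = [to_count_vector(doc, vocabulary) for doc in documents]

for doc, vector in zip(documents, vectors):
    print(f"{doc!r} -> {vector}")
```

Each document becomes a vector with one position per vocabulary word, so documents of different lengths all end up with the same numeric shape that a model can consume.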
Machine Learning Approach
The most typical machine learning approach is to first split the data into two sets: a training set and a test set. We first convert the entire dataset into numerical vectors and then split it. The training set is fed to the machine learning model. After a sufficient number of iterations or epochs (one epoch is one full pass over the training data), we use the trained model to make predictions on the test set. We then check how well the model performed on the test set. Good test performance is what really matters, because it indicates how the model will behave on data it has not seen.
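The split itself can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions: the helper name `train_test_split`, the 20% test fraction, the fixed seed and the toy dataset are all choices made for this example rather than anything prescribed above.

```python
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    test_size = max(1, round(len(shuffled) * test_fraction))
    return shuffled[test_size:], shuffled[:test_size]


# Hypothetical dataset: (feature vector, label) pairs invented for illustration.
dataset = [([float(i), float(i % 3)], i % 2) for i in range(10)]
train_set, test_set = train_test_split(dataset, test_fraction=0.2)

print(len(train_set), "training rows,", len(test_set), "test rows")
```

Shuffling before splitting matters when the rows are stored in some meaningful order, since otherwise the test set could end up containing only one kind of example.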
A machine learning model may perform really well on the training set and not so well on the test set. This is an example of overfitting. In this scenario, the model has learned the training data too closely, including its noise, instead of learning patterns that generalize. It therefore fits the training data very well and gives good predictions on that set. When we ask for predictions on the test set, however, the model fails, because its parameters were fitted to the particulars of the training set and do not carry over to new data. We therefore always have to look at the accuracy on the test set and not just on the training set.
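A small, deliberately contrived experiment can make this gap between training and test accuracy concrete. In the sketch below, all data and names are invented for illustration: the "memorizer" is a 1-nearest-neighbour model that copies the label of the closest training point, and the "threshold" model is a single cut-off rule fitted on the same data. Two training labels are intentionally wrong to play the role of noise.

```python
def accuracy(model, examples):
    """Fraction of (x, label) examples for which model(x) equals label."""
    return sum(model(x) == label for x, label in examples) / len(examples)


def fit_memorizer(train):
    """1-nearest-neighbour: copy the label of the closest training point."""
    def predict(x):
        _, nearest_label = min(train, key=lambda pair: abs(pair[0] - x))
        return nearest_label
    return predict


def fit_threshold(train):
    """Pick the cut-off t (from the training xs) whose rule 'x >= t' fits best."""
    def rule(t):
        return lambda x: int(x >= t)
    best_t = max((x for x, _ in train), key=lambda t: accuracy(rule(t), train))
    return rule(best_t)


# Contrived data: the underlying rule is "label is 1 when x >= 5",
# but two training labels (at x=4 and x=10) are deliberately noisy.
train = [(0, 0), (2, 0), (4, 1), (6, 1), (8, 1), (10, 0), (12, 1)]
test = [(1.2, 0), (3.8, 0), (4.3, 0), (6.5, 1), (9.6, 1), (10.4, 1), (12.5, 1)]

memorizer = fit_memorizer(train)
threshold = fit_threshold(train)

for name, model in [("memorizer", memorizer), ("threshold", threshold)]:
    print(f"{name}: train={accuracy(model, train):.2f} test={accuracy(model, test):.2f}")
```

On this toy data the memorizer scores 1.00 on the training set but only 0.43 on the test set, while the simpler threshold rule scores 0.86 on both. The numbers come from a hand-built example, so they show the pattern rather than measure anything real.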
Sometimes a model does not fit even the training data well. It then scores poorly on the training data on metrics such as accuracy, precision and recall, and this leads to poor performance on the test set as well. This is known as underfitting, and it can be caused by insufficient data, features that are uncorrelated with the output, and so on. We therefore have to make sure that a model does well on both the training data and the test data.
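For a binary classifier, the metrics mentioned above can be computed directly from the true and predicted labels. The following is a minimal sketch: the function name `classification_metrics` and the sample labels are made up for this example, and returning 0.0 when a denominator is zero is a convention chosen here, not a universal rule.

```python
def classification_metrics(actual, predicted, positive=1):
    """Return accuracy, precision and recall for binary labels."""
    pairs = list(zip(actual, predicted))
    true_pos = sum(a == positive and p == positive for a, p in pairs)
    false_pos = sum(a != positive and p == positive for a, p in pairs)
    false_neg = sum(a == positive and p != positive for a, p in pairs)
    correct = sum(a == p for a, p in pairs)
    return {
        # share of all predictions that were right
        "accuracy": correct / len(pairs),
        # of everything predicted positive, how much really was positive
        "precision": true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0,
        # of everything really positive, how much the model found
        "recall": true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0,
    }


# Hypothetical labels invented for illustration.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 1]
metrics = classification_metrics(actual, predicted)
print(metrics)
```

Computing the same metrics on both the training set and the test set is what lets us tell the two situations apart: low scores on both point to a poor fit, while a high training score with a low test score points to overfitting.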
Why has machine learning gained so much traction in recent years?
Machine learning algorithms, including neural networks, were proposed decades ago. In those days, however, there was not enough data to make use of them, and the computational power needed to run them was very limited. Today, companies generate a lot of data and the available computational resources are vast. A look at the services offered by Google (Google Cloud) and Amazon (Amazon Web Services) gives a good idea of the computational power available at present. We can use these technologies without setting up the hardware infrastructure ourselves, because the providers supply it. As a result, there is huge demand for machine learning and deep learning. Many companies are investing large sums in machine learning research to improve their productivity and increase the revenue from products that use it.
In a world with so much data, it is important to understand machine learning and deep learning in order to make predictions for different use cases. Looking at how much data is available and how companies are leveraging it, it becomes clear that the better we understand machine learning models, the more value we can build for a company and the more we can contribute to its profits.
In the above image, we see a robot that appears highly intelligent and human-like, with a child interacting with it. It suggests how far machines have advanced in the modern era, and we can expect further advances in machine learning and deep learning. Many repetitive jobs could be automated by machine learning and deep learning models. With continued technological progress, machine learning is likely to be used in fields as varied as agriculture, finance, health and social networking. It is therefore worth spending the time and effort to master machine learning concepts and apply them in practice.
Types of Machine Learning
- Supervised Learning: In this type of machine learning, we know the output labels. We train the models on inputs paired with their known outputs and evaluate them by comparing their predictions with the actual outputs. For example, to predict house prices from historical data containing input features and the corresponding prices, we train the models and compare their predicted prices with the actual ones. Because the correct outputs are known, we can measure performance and keep improving the models until they predict well. This is called supervised learning.
- Unsupervised Learning: In this approach, we do not know the output. We train the models on the inputs alone and they identify patterns and structure in the data. A popular example is customer segmentation, where we group customers on the basis of their behavior in certain scenarios.
- Semi-supervised learning: In this approach, we have output labels for only a few data points and no labels for the rest. We train the models on the labeled data and then let them use the patterns they have learned to label the remaining data. One example is text document classification: we first train on the documents whose categories are known and then ask the models to classify the documents whose categories are not known.
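To make the contrast between the first two types concrete, here is a rough sketch in plain Python. The supervised part fits a straight line to house areas and known prices using ordinary least squares, and the unsupervised part groups customers by spend with a very small two-cluster k-means. Both algorithms, the function names `fit_line` and `two_means`, and all the numbers are assumptions chosen for illustration, and semi-supervised learning is not covered by this example.

```python
def fit_line(xs, ys):
    """Supervised: least-squares fit of y = slope * x + intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def two_means(values, iterations=10):
    """Unsupervised: split 1-D values into two groups around two centers."""
    low, high = min(values), max(values)  # start the centers at the extremes
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [0 if abs(v - low) <= abs(v - high) else 1 for v in values]
        group0 = [v for v, label in zip(values, labels) if label == 0]
        group1 = [v for v, label in zip(values, labels) if label == 1]
        if not group0 or not group1:  # nothing to separate
            break
        low, high = sum(group0) / len(group0), sum(group1) / len(group1)
    return labels, (low, high)


# Supervised: hypothetical areas (square meters) with known prices (thousands).
areas = [50, 80, 100, 120]
prices = [110, 170, 210, 250]
slope, intercept = fit_line(areas, prices)
predicted_price = slope * 90 + intercept  # prediction for an unseen house

# Unsupervised: hypothetical monthly spend per customer, with no labels at all.
spend = [10, 12, 11, 95, 100, 105]
segments, centers = two_means(spend)

print(predicted_price, segments, centers)
```

The key difference is visible in the inputs: `fit_line` needs the known prices to learn from, while `two_means` receives only the spend values and has to discover the two groups on its own.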
What are the limitations of machine learning?
- One limitation is the curse of dimensionality. As we give the models more features, the data becomes sparser relative to the size of the feature space, so the models need more data and more time to train. This often results in weaker performance. The longer training time in turn leads to a longer development time before the models can be put into production.
- Overfitting can be an issue in machine learning. This happens when the models predict the output very well on the training data but fail to generalize to new data, such as the data we use as a test set.
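One way to get a feel for why adding features is costly is to count how quickly the feature space grows. The sketch below is a back-of-the-envelope illustration only: splitting every feature into 10 bins and fixing the dataset at 1,000 samples are arbitrary assumptions, and the function names are invented for this example.

```python
def grid_cells(num_features, bins_per_feature=10):
    """Number of cells when every feature is split into the same number of bins."""
    return bins_per_feature ** num_features


def max_coverage(num_samples, num_features, bins_per_feature=10):
    """Upper bound on the fraction of cells that the samples could occupy."""
    return min(1.0, num_samples / grid_cells(num_features, bins_per_feature))


# With a fixed, hypothetical dataset of 1,000 samples, coverage collapses quickly.
for num_features in (1, 2, 3, 6, 10):
    cells = grid_cells(num_features)
    coverage = max_coverage(1000, num_features)
    print(f"{num_features:>2} features: {cells:>14,} cells, at most {coverage:.7%} covered")
```

With three features the 1,000 samples could in principle touch every cell, but with six features they can reach at most 0.1% of them, which is why models with many features tend to need far more data to learn reliable patterns.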
In this article, we have covered what machine learning is and the broad types of machine learning. We have also seen some of its limitations when it is applied in real life. I hope this article gave you a good intuition for machine learning. Thank you for reading.