An Introduction to Machine Learning

Anyim Amarachi
Published in Analytics Vidhya
Aug 1, 2020

Say you are practising basketball on your own and you are trying to shoot the ball into the hoop. If you fail at the first try, your first instinct would most probably be to move forward or backwards, maybe jump higher or go lower, or even stretch your hands properly. Thing is, whatever you do, you are trying to get that ball into the basket. If it does not work, you keep trying new tactics to eventually reach your goal. This is the concept of machine learning.

Machine learning is an application of artificial intelligence that gives systems the ability to learn and improve from experience automatically, without being explicitly programmed. It focuses on developing computer programs that can access data and use statistical analysis to predict an output, updating those predictions as new data becomes available (i.e. learning).

“A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.” - Tom Mitchell

Classification of Machine Learning

There are various categories of machine learning. They are:

  • Supervised Learning
  • Unsupervised Learning
  • Reinforcement Learning

Supervised Learning: Here, the system is supplied with previously labelled data, so it can apply what it has learned from those labelled examples to new data and predict future events. It is like someone trying to memorize new facts while checking them against their notes. The learning algorithm compares its output with the correct, intended output and uses the errors it finds to modify the model accordingly. A typical example is classifying email as spam: you already have some emails that have been labelled “spam”, and you classify new emails as spam or not depending on whether they share the qualities of the spam mails. Regression is another type of supervised learning.
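
To make the "compare the output with the correct answer, then adjust" loop concrete, here is a minimal sketch using a perceptron, one of the simplest supervised learners. The two features (how often the words "free" and "meeting" appear in an email) and the four labelled emails are made up for illustration, and a real spam filter would be far more involved.

```python
def predict(weights, bias, features):
    """Return 1 (spam) if the weighted score is positive, else 0 (not spam)."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train_perceptron(examples, epochs=20, learning_rate=1.0):
    """Learn weights from (features, label) pairs, where label is 0 or 1."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            # Compare the model's output with the correct, intended output.
            error = label - predict(weights, bias, features)
            # Nudge the model only when it was wrong (error is -1 or +1).
            for i, x in enumerate(features):
                weights[i] += learning_rate * error * x
            bias += learning_rate * error
    return weights, bias


# Hypothetical labelled data: (count of "free", count of "meeting") -> 1 = spam
labelled_emails = [
    ((3, 0), 1),
    ((2, 0), 1),
    ((0, 2), 0),
    ((0, 1), 0),
]

weights, bias = train_perceptron(labelled_emails)
print(predict(weights, bias, (4, 0)))  # a new email that says "free" four times
print(predict(weights, bias, (0, 3)))  # a new email that says "meeting" three times
```

The labels are what make this supervised: without them, the model would have no error to learn from.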

Unsupervised Learning: Here, the system is presented with unlabelled, uncategorized data, leaving it to the algorithm to determine the patterns in the data on its own. The system is not told the right output; instead it explores the data and draws inferences that describe hidden structures in it. Recommendation systems, commonly seen on the web in marketing automation, are often based on this type of learning. Clustering and association are types of unsupervised learning.
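
As a rough illustration of association, the sketch below counts which items appear together in unlabelled shopping baskets and recommends the most frequent companion of a given item. The baskets and the function names `pair_counts` and `recommend` are invented for this example, and real recommendation systems use much richer methods.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how often each pair of items appears in the same basket."""
    counts = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            counts[(a, b)] += 1
    return counts


def recommend(item, baskets):
    """Return the item most often bought together with `item`, or None."""
    best, best_count = None, 0
    for (a, b), n in pair_counts(baskets).items():
        if item in (a, b) and n > best_count:
            best = b if a == item else a
            best_count = n
    return best


# Hypothetical, unlabelled purchase data: nobody told us what goes with what.
baskets = [
    ["bread", "butter", "jam"],
    ["bread", "butter"],
    ["bread", "milk"],
    ["milk", "cereal"],
]

print(recommend("bread", baskets))
print(recommend("cereal", baskets))
```

No basket carries a label here. The "bread goes with butter" pattern comes purely from the structure of the data.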

Reinforcement Learning: Here, you present the system with examples that lack labels, as in unsupervised learning, but this time you accompany each example with positive or negative feedback (a reward system) according to the solution the algorithm proposes. It is closely related to dynamic programming and trains algorithms using a system of reward and punishment. This method allows the algorithm, or agent, to automatically determine the ideal behaviour within a specific context in order to maximize its performance. The agent learns by interacting with its environment. This is typically seen when computers learn to play games, optimizing their score and in some cases outperforming human players.
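
The reward-and-punishment idea can be sketched with tabular Q-learning, one common reinforcement learning method. The environment here is an invented five-cell corridor where the agent starts in cell 0 and is rewarded only on reaching cell 4. The hyperparameters (`alpha`, `gamma`, `epsilon`) are illustrative choices, not recommendations.

```python
import random

N_STATES = 5        # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)  # action index 0 = step left, index 1 = step right


def step(state, action_index):
    """Move within the corridor; reward 1.0 only when the goal is reached."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _step in range(200):  # safety cap on episode length
            # Explore sometimes (or when undecided); otherwise act greedily.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward = step(state, action)
            # Move the estimate towards reward + discounted future value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if state == N_STATES - 1:
                break
    return q


q_table = train()
# Greedy policy for the non-goal cells.
policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
print(policy)
```

Nobody tells the agent that "right" is correct. It only ever sees the reward at the end of the corridor and works backwards from there.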

Choosing the Right Machine Learning Problem

You have collected a bunch of data and want to use machine learning techniques to analyse it. How do you choose the right machine learning problem for your use case? The problem categories we will cover in this article are:

  • Classification
  • Regression
  • Clustering
  • Dimensionality reduction

Classification: Use this when you need to sort your input data into categories or classes. Predicting categories is a very common use case, and the categories can be virtually anything. As in the email example above: is this email “spam” or “not spam”? Should it go to the “inbox” or the “spam” folder? As a financial trader constantly monitoring stock markets, given past information on the market, company performance and stock performance, should you “buy”, “sell” or “hold”? Or say you are working with image data and want to do object recognition: is this a “cat”, a “mouse” or a “dog”? The list is endless, but in every case the output of a classification model is a category or class.
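
A minimal sketch of a classifier, assuming a k-nearest-neighbours approach: a new example gets the most common class among its closest labelled neighbours. The animal measurements (weight in kg, ear length in cm) are invented for illustration, and real object recognition works on images rather than two hand-picked numbers.

```python
import math
from collections import Counter


def knn_predict(labelled, query, k=3):
    """Return the most common class among the k labelled points nearest to query."""
    neighbours = sorted(labelled, key=lambda ex: math.dist(ex[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical labelled data: (weight in kg, ear length in cm) -> class
labelled_animals = [
    ((4.0, 6.0), "cat"), ((4.5, 6.5), "cat"), ((3.8, 5.5), "cat"),
    ((0.03, 1.5), "mouse"), ((0.02, 1.2), "mouse"), ((0.025, 1.4), "mouse"),
    ((20.0, 10.0), "dog"), ((25.0, 12.0), "dog"), ((18.0, 9.0), "dog"),
]

print(knn_predict(labelled_animals, (4.2, 6.1)))
print(knn_predict(labelled_animals, (22.0, 11.0)))
```

Whatever the input, the output is always one of the known classes, which is what distinguishes classification from regression.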

Regression: When you want your model to predict continuous numeric values, use a regression model. As a financial trader who needs to predict tomorrow’s stock price given current market sentiment and the company’s previous earnings, a regression model is your guy. You might be analysing the different cars available and want to predict a car’s mileage from its attributes, or trying to predict the price of a house from its location and condition. Once you can see the nature of the problem, it is easier to know what to use.
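
As a small worked example, the sketch below fits a straight line by ordinary least squares and uses it to predict a house price from its size. The sizes and prices are made-up numbers chosen to lie exactly on a line, so the fit is perfect. Real data would be noisy and would usually involve many more features.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: house size in square metres -> price in thousands
sizes = [50, 80, 100, 120]
prices = [110, 170, 210, 250]

slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 90 + intercept  # price estimate for a 90 m² house
print(slope, intercept, predicted_price)
```

Unlike a classifier, the model can output any number on a continuous scale, including values it never saw during training.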

Clustering: When you have a really large dataset and no idea what is in it, clustering is a good way to start making sense of it. In social media ad targeting, grouping users by shared interests so you can show specific ads to each group is an application of clustering. Another is document discovery: you could gather all documents related to armed robbery and see if you can find patterns in the cases. Clustering lets you discover patterns in the data without being told in advance what to look for.
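
One way to sketch this is k-means, a widely used clustering algorithm: assign every point to its nearest centre, move each centre to the average of its points, and repeat. The user data (hours spent on sports content, hours spent on cooking content) and the choice of starting centres are assumptions made for this example.

```python
import math


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        for p in points
    ]


def kmeans(points, initial_centroids, iterations=10):
    """Alternate between assigning points and re-centring each cluster."""
    centroids = list(initial_centroids)
    for _ in range(iterations):
        labels = assign(points, centroids)
        for i in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # leave a centroid in place if it attracted no points
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assign(points, centroids)


# Hypothetical users: (hours on sports content, hours on cooking content)
users = [(9, 1), (8, 2), (10, 1), (1, 9), (2, 8), (1, 10)]

centroids, labels = kmeans(users, initial_centroids=[users[0], users[3]])
print(labels)
print(centroids)
```

The algorithm never learns that one group is "sports fans" and the other "home cooks". It only finds that the users fall into two groups, and naming them is left to you.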

Dimensionality Reduction: This is a preprocessing technique used to cut a large set of features down to the ones, or the combinations of them, that carry the most information. Let’s say you have 500 different variables. Which of them are most significant? Which features should you pay more attention to? This is where dimensionality reduction comes into play. It is used to preprocess your data so you can build more robust machine learning models with better performance, whether they are classification, regression or any other kind. Dimensionality reduction also helps us find latent factors when we have large data and no target values.
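
A common technique here is principal component analysis (PCA). The sketch below finds only the first principal component, using power iteration on the covariance matrix, and projects two features down to one. The toy data is invented, and with 500 real variables you would normally reach for a numerical library instead of plain Python.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (feature means, unit vector along the direction of most variance)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centred data.
    cov = [
        [sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration: repeatedly multiply by cov and renormalise.
    # Assumes the start vector is not orthogonal to the top component.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v


def project(rows, means, direction):
    """Reduce each row to a single coordinate along `direction`."""
    return [
        sum((r[j] - means[j]) * direction[j] for j in range(len(direction)))
        for r in rows
    ]


# Hypothetical data: the second feature is always twice the first,
# so one number per row is enough to describe it.
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]

means, direction = first_principal_component(rows)
reduced = project(rows, means, direction)
print(direction)
print(reduced)
```

Because the two features move together, a single coordinate preserves all of the variation in this data. That is the sense in which dimensionality reduction uncovers latent factors.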

Conclusion

Machine learning comes into the picture when problems cannot be solved by typical approaches. It enables the analysis of large volumes of data and can deliver faster, more accurate results that help identify profitable opportunities or dangerous risks.

This article is intended only as an introduction to the concept of machine learning. There is a lot more to learn, and you can do it by wanting to learn, making time and finding the right resources online. I hope I have been able to make you want to learn more.
