Understand Machine Learning

Kayvan Kaseb
Software Development
8 min read · Apr 26, 2020
Photo hosted by Forbes

Growing interest in Machine Learning has pushed companies to focus on it and invest in it, and the field has become an exciting research topic for computer science scholars. Machine Learning (ML) is the branch of computational science that focuses on analyzing and interpreting patterns and structures in data to enable learning, reasoning, and decision making without human interaction. This essay introduces the concepts, methods, and steps of Machine Learning.

Overview and Introduction

Machine Learning (ML) is an exciting field of research in computer science and engineering. It is considered a subset of Artificial Intelligence (AI) because it enables the extraction of meaningful patterns from samples, which is a capability of human intelligence. The appeal of a computer that performs repetitive, well-defined tasks is clear: computers perform a given task consistently and tirelessly, whereas the same tasks are difficult for humans to sustain. In recent years, machines have shown the ability to learn and even master tasks that were thought to be too complicated for them, which suggests that machine learning algorithms are potentially useful components of detection and decision support systems. In some situations, computers even seem able to observe patterns that are beyond human perception, a finding that has led to substantially increased interest in machine learning across various areas.

At a high level, machine learning is the process of teaching a computer system how to make accurate predictions when fed data. Those predictions could be deciding whether a piece of fruit in a photo is a banana or an apple, spotting people crossing the road in front of a self-driving car, determining whether the word "book" in a sentence relates to a paperback or a hotel reservation, flagging an email as spam, or recognizing speech accurately enough to generate captions for a video.

The key difference between traditional software and the machine learning approach is that a human developer has not written code that instructs the system how to tell the banana from the apple. Instead, a machine learning model has been taught to reliably discriminate between the fruits by training on a large amount of data, in this instance likely a massive number of images labelled as containing a banana or an apple.

Artificial Intelligence and Machine Learning, the photo is provided by Oracle

What is the definition of Machine Learning (ML)?

Machine Learning is concerned with the question of how to construct computer programs that automatically improve with experience. In other words, the answer is in your data. Machine Learning is a subset of AI that uses statistical methods to enable machines to improve with experience, so that a computer system can make the decisions needed to carry out a certain task. These programs or algorithms are designed so that they can learn and improve over time as they observe new data. The aim of Machine Learning is to derive meaning from data, so data is the key that unlocks it: in general, the more high-quality data an ML algorithm has, the more accurate it becomes.

Machine Learning (ML) is the science of computer algorithms that improve automatically through past experiences.

To make predictions or decisions without being explicitly programmed, Machine Learning algorithms build a model based on examples, which are called training data.

Machine Learning is the extraction of knowledge from data.

Difference between AI and ML: The goal of Artificial Intelligence is to create a machine that can mimic the human mind, which requires learning capabilities. However, AI is about more than learning; it is also about knowledge representation, reasoning, and abstract thinking. In contrast, Machine Learning is focused solely on writing software that can learn from past experience. Besides, Machine Learning is more closely related to Data Mining and Statistics than it is to Artificial Intelligence.

Machine Learning: Using data to answer questions. “Using data” is what is generally referred to as “training”, and “answering questions” is referred to as “making predictions” or “inference”.
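The split between training and inference can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data (hours studied versus exam score) is made up, and the function names `train` and `predict` are hypothetical. The model is a straight line fitted by ordinary least squares, which is one simple choice among many.

```python
# Hypothetical toy data: hours studied -> exam score (made up for illustration).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]


def train(xs, ys):
    """'Using data': fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """'Answering questions': apply the learned parameters to a new input."""
    slope, intercept = model
    return slope * x + intercept


model = train(hours, scores)      # training
print(predict(model, 6.0))        # inference on an unseen input, prints 62.0
```

Nothing in `predict` was written by hand for this particular data set; the slope and intercept come entirely from the examples passed to `train`.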

Machine Learning methods

Although a number of approaches are used in Machine Learning, the most popular ones are as follows:

1. Supervised Learning

Supervised Learning is where you teach and train the machine using well-labelled data. This means that each example is already tagged with the correct answer or outcome. In general, the larger the data set, the more the machine can learn about the subject. After the machine is trained, it is given new, previously unseen data, and the trained model uses its past experience to produce an outcome. Supervised learning is commonly used in applications where historical data predicts future events. For instance, it can anticipate when credit card transactions are likely to be fraudulent or which insurance customer is likely to file a claim.

An example for Supervised Learning based on topics, picture by Google Developer Tutorial
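As a rough sketch of the fraud example, the snippet below classifies a new transaction by a majority vote among its k nearest labelled neighbours. The transactions, the two features (amount and hour of day), and the function name `classify` are all assumptions made for illustration. Real systems would use far more data and would scale the features first.

```python
import math
from collections import Counter

# Hypothetical labelled transactions: (amount in dollars, hour of day) -> label.
training_data = [
    ((12.0, 14), "legitimate"),
    ((25.0, 10), "legitimate"),
    ((40.0, 18), "legitimate"),
    ((18.0, 12), "legitimate"),
    ((950.0, 3), "fraud"),
    ((1200.0, 2), "fraud"),
    ((870.0, 4), "fraud"),
]


def classify(features, labelled_examples, k=3):
    """Label a new example by majority vote among its k nearest neighbours."""
    by_distance = sorted(
        labelled_examples, key=lambda example: math.dist(features, example[0])
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Previously unseen transactions.
print(classify((1000.0, 3), training_data))   # prints fraud
print(classify((30.0, 13), training_data))    # prints legitimate
```

The labels in `training_data` play the role of the "correct answers" described above: without them, the vote would have nothing to count.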

2. Unsupervised Learning

Unsupervised Learning is where the machine is trained using a data set that does not have any labels or tags. The learning algorithm is never told what the data represents. Unsupervised learning is like listening to a podcast in a foreign language that you do not understand, with no teacher and no dictionary to explain what you are hearing. If you listen to just one podcast, it will not be of much benefit to you. However, if you listen to hundreds of hours of those podcasts, your brain will start to form a model of how the language works. It will start to recognize patterns, and you will start to expect certain sounds. Techniques commonly used in Unsupervised Learning include self-organizing maps, nearest-neighbor mapping, and k-means clustering. The main goal is to explore the data and find some structure within it.

An example of clustering in k-means algorithm, picture by Google Developer Tutorial

K-means clustering is one of the most popular unsupervised machine learning algorithms. It aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, which is called the cluster center or cluster centroid.

A cluster refers to a collection of data points aggregated together due to specific similarities.
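The description above can be sketched as a short loop that alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its points. This is a minimal illustration, not a production implementation: the sample points, the function name `k_means`, and the choice of random starting centroids are assumptions.

```python
import math
import random


def k_means(points, k, iterations=20, seed=0):
    """Partition points into k clusters around their nearest mean."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)          # assumed start: k random points
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = []
        for i, cluster in enumerate(clusters):
            if cluster:
                mean = tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[i])   # keep an empty cluster's centroid
        if new_centroids == centroids:               # nothing moved: converged
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical unlabelled observations forming two loose groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = k_means(points, k=2)
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

No labels are passed in anywhere; the grouping emerges only from the distances between observations.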

3. Reinforcement Learning

Reinforcement Learning is similar to unsupervised learning in that the training data is unlabeled. However, the outcome of the machine's actions is graded. For instance, if the machine wins a game, the result is trickled back down through the set of moves to reinforce the validity of those moves. The computer does not play just one or two games. When it plays thousands or even millions of games, the cumulative effect of the reinforcement creates a winning strategy. The goal is for the agent (the learner or decision maker) to choose actions that maximize the expected reward over a given amount of time. The agent will reach the goal much faster by following a good policy, that is, a good rule for choosing an action in each situation. As a result, the objective in reinforcement learning is to learn the best policy. Reinforcement Learning is often used for robotics, gaming, and navigation.

An example of Reinforcement Learning in dog training, picture is provided by this article
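One common way to let a reward "trickle back" through earlier moves is tabular Q-learning, sketched below. The article does not name a specific algorithm, so this choice is an assumption, as are the toy environment (a five-cell corridor with a reward only in the last cell) and the hyper-parameter values.

```python
import random

N_STATES = 5                 # corridor cells 0..4; reaching cell 4 wins the game
ACTIONS = (-1, +1)           # action 0 moves left, action 1 moves right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2   # assumed learning rate, discount, exploration


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]     # q[state][action]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            if rng.random() < EPSILON:
                action = rng.randrange(2)         # explore: try a random move
            else:
                best = max(q[state])              # exploit, breaking ties randomly
                action = rng.choice([a for a in range(2) if q[state][a] == best])
            next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
            reward = 1.0 if next_state == N_STATES - 1 else 0.0
            # The graded outcome flows backwards: the value of a move depends
            # on the reward plus the discounted value of where it leads.
            target = reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy policy: in each non-terminal cell, pick the action with the higher value.
policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
print(policy)
```

After a single game only the move next to the goal has any value. Over many games that value spreads to earlier cells, so with enough episodes the greedy policy should move right from every cell.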

4. Semi-supervised Learning

Semi-supervised Learning is used for the same applications as Supervised Learning. However, it uses both labeled and unlabeled data for training. Typically, a small amount of labeled data is combined with a large amount of unlabeled data, because unlabeled data is less expensive and takes less effort to acquire. This type of learning can be used with methods such as classification, regression, and prediction. Semi-supervised Learning is useful when the cost of labeling is too high to allow for a fully labeled training process. An example of this approach is identifying a person’s face on a webcam.
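One simple way to combine a few labels with many unlabeled points is self-training: fit a model on the labeled data, label the unlabeled points it is most confident about, and repeat. The sketch below uses that technique with a nearest-centroid classifier. Both choices are assumptions for illustration (the article does not prescribe a method), and the data and function names are hypothetical.

```python
import math

# Hypothetical data: two labelled points and six cheaper, unlabelled ones.
labelled = [((1.0, 1.0), "A"), ((9.0, 9.0), "B")]
unlabelled = [(1.5, 2.0), (2.0, 1.0), (8.0, 8.5), (9.5, 8.0), (2.5, 2.5), (7.5, 9.0)]


def class_centroids(examples):
    """Average the feature vectors of each class."""
    groups = {}
    for features, label in examples:
        groups.setdefault(label, []).append(features)
    return {
        label: tuple(sum(axis) / len(members) for axis in zip(*members))
        for label, members in groups.items()
    }


def nearest_label(point, centroids):
    """Return (distance, label) for the closest class centroid."""
    return min((math.dist(point, c), label) for label, c in centroids.items())


def self_train(labelled, unlabelled, per_round=2):
    examples, pool = list(labelled), list(unlabelled)
    while pool:
        centroids = class_centroids(examples)
        # The most confident points are those closest to some centroid.
        ranked = sorted(pool, key=lambda p: nearest_label(p, centroids))
        for point in ranked[:per_round]:
            examples.append((point, nearest_label(point, centroids)[1]))
            pool.remove(point)
    return examples


examples = self_train(labelled, unlabelled)
final_centroids = class_centroids(examples)
print(nearest_label((3.0, 3.0), final_centroids)[1])   # prints A
```

Only two points were labeled by hand; the other six received their labels from the model itself, which is where the saving in labeling cost comes from. The risk is that an early wrong guess gets reinforced in later rounds.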

7 steps of Machine Learning

1. Data Collection: The quantity and quality of your data directly determine how good your predictive model can be.

2. Data Preparation: Raw data on its own is not very useful. The data needs to be prepared, normalized, and de-duplicated, and errors should be removed. Visualizing the data can help you find patterns and outliers and check whether the required data has been collected or is missing.

3. Choose a Model: There are many different models for various tasks and goals, so you should choose the right model for your purpose based on your business goal. You should also consider how much preparation the model needs, how accurate it is, and how scalable it is.

4. Train the Model: The objective of training is to answer a question or make a prediction correctly as often as possible. Training uses your training data to incrementally improve the predictions of the model. Each cycle of updating the weights and biases is considered one training step.

5. Evaluate the Model: This step uses a metric, or a combination of metrics, to measure the objective performance of the model. It means you should test the model against previously unseen data to see how it performs.

6. Parameter Tuning: This step refers to hyper-parameter tuning, which might include the number of training steps, the learning rate, and the initialization values and their distribution. In other words, you adjust these settings to improve the training process.

7. Make Predictions: As noted earlier, Machine Learning is using data to answer questions. Prediction, or inference, is therefore the final step, where we get to answer some questions.
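The seven steps above can be walked through end to end on a toy problem. The following is a minimal sketch under stated assumptions: the readings are synthetic (generated from y = 3x + 2 with a small alternating error), the model is a straight line trained by gradient descent, and every name in the code is hypothetical.

```python
# 1. Data collection: synthetic (x, y) readings, an assumption for illustration.
raw = [(float(x), 3.0 * x + 2.0 + (0.2 if x % 2 == 0 else -0.2)) for x in range(12)]
raw += [raw[5], (4.0, None)]      # a duplicate row and a row with a missing value

# 2. Data preparation: drop incomplete rows, de-duplicate, normalize x to [0, 1].
clean = list(dict.fromkeys(row for row in raw if row[1] is not None))
x_min = min(x for x, _ in clean)
x_max = max(x for x, _ in clean)


def normalize(x):
    return (x - x_min) / (x_max - x_min)


rows = [(normalize(x), y) for x, y in clean]
test_rows = rows[::4]                                        # held-out data
train_rows = [row for i, row in enumerate(rows) if i % 4 != 0]


# 3. Choose a model: a straight line, y = weight * x + bias.
def predict(params, x):
    weight, bias = params
    return weight * x + bias


# 4. Train the model: each loop iteration is one training step that nudges
#    the weight and bias against the gradient of the mean squared error.
def train(data, learning_rate, steps=1000):
    weight, bias = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        errors = [(predict((weight, bias), x) - y, x) for x, y in data]
        grad_weight = sum(2 * error * x for error, x in errors) / n
        grad_bias = sum(2 * error for error, _ in errors) / n
        weight -= learning_rate * grad_weight
        bias -= learning_rate * grad_bias
    return weight, bias


# 5. Evaluate the model: mean squared error on data it has not seen.
def mean_squared_error(params, data):
    return sum((predict(params, x) - y) ** 2 for x, y in data) / len(data)


# 6. Parameter tuning: try a few learning rates and keep the best one.
#    (A shortcut: a separate validation set would normally be used here.)
best_mse, best_rate, best_params = min(
    (mean_squared_error(params, test_rows), rate, params)
    for rate in (0.01, 0.1, 0.5)
    for params in [train(train_rows, rate)]
)

# 7. Make predictions: answer a question about an input the model never saw.
print(best_rate, best_mse)
print(predict(best_params, normalize(12.0)))    # the underlying rule gives 38
```

With the smallest learning rate the model has not finished learning after 1000 steps, which is why tuning matters even on a problem this small.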

In conclusion

Machine Learning (ML) is an exciting field of research in computer science and engineering. It is also used in many real-life areas these days, such as finance, CRM, self-driving cars, and business intelligence. Machine Learning allows software to become more accurate at predicting outcomes. Although there has been tremendous progress in machine learning technology since the idea was first imagined 50 years ago, some issues remain for researchers to address. This article introduced some of the concepts, approaches, and steps in Machine Learning.

Kayvan Kaseb
Software Development

Senior Android Developer, Technical Writer, Researcher, Artist, Founder of PURE SOFTWARE YAZILIM LİMİTED ŞİRKETİ https://www.linkedin.com/in/kayvan-kaseb