Data Science for Executives

Machine Learning Basics

Jhimli Bora
Hashmap, an NTT DATA Company
4 min read · Jan 26, 2021

What is Machine Learning?

It is the capability of artificial intelligence systems to learn by extracting patterns from data without being explicitly programmed.

Ok, but what is artificial intelligence?

At a high level, it is the idea that machines can be programmed to simulate human and animal intelligence and, in doing so, learn new behaviors that were not explicitly taught by their designer or programmer.

In traditional programming, a program (a set of rules) is executed on a computer to produce the output. The rules are fixed by the programmer, so the same input always produces the same output, and the data never changes the rules themselves.

In machine learning, the data supplied to a running program can, and will, fundamentally change the program's behavior. Strictly speaking, machine learning is a subset of artificial intelligence.
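The contrast above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the glucose readings, the labels, the hand-picked cut-off of 140, and the function names are all made up for the example. The first function is a fixed rule written by a programmer; the second derives its rule from labeled data, so feeding it different data changes how it behaves.

```python
def rule_based_flag(glucose):
    """Traditional programming: the programmer hard-codes the rule."""
    return glucose >= 140  # hypothetical hand-picked cut-off


def learn_threshold(readings, labels):
    """Machine learning (very simplified): pick the cut-off that
    classifies the most training examples correctly."""
    best_threshold, best_correct = None, -1
    for candidate in sorted(set(readings)):
        correct = sum((r >= candidate) == bool(y) for r, y in zip(readings, labels))
        if correct > best_correct:
            best_threshold, best_correct = candidate, correct
    return best_threshold


# Two made-up training sets: same code, different labels, different learned rule.
readings = [90, 100, 110, 150, 160]
threshold_a = learn_threshold(readings, [0, 0, 0, 1, 1])
threshold_b = learn_threshold(readings, [0, 1, 1, 1, 1])

print(threshold_a, threshold_b)
```

The learning code is identical in both calls; only the data differs, and that alone is enough to produce two different rules.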

Machine Learning Applications:

Machine learning is being used increasingly in various fields. Listed below are a few of its uses:

Image Processing: image tagging and recognition, self-driving cars, optical character recognition (OCR)

Robotics: human simulation, humanoid robots, industrial robotics

Data Mining: anomaly detection, grouping and prediction, association rules

Video Games: Pokémon, AlphaGo

Text Analysis: sentiment analysis, spam filtering, information extraction

Healthcare: medical imaging & diagnostics, patient data & risk analytics, lifestyle management & monitoring

Types of Machine Learning Methods

There are three popular types of machine learning methods.

Supervised Learning: When we have a dataset with known outputs or labels, we can use a type of algorithm known as supervised learning. The supervision comes from the known outputs, which teach the algorithm the correct responses.

An example of a labeled training dataset:

Sample training dataset (Kaggle Pima Indians Diabetes Database) for predicting diabetes based on various features

Note: In the above image, the features are pregnancies, glucose, blood pressure, skin thickness, insulin, BMI, diabetes pedigree function, and age. The label is the outcome (1 = yes, 0 = no).

In other words, the label corresponds to what we will try to predict for unseen data. In supervised learning, the label is always available in the training data, and it is what the model is trained on.

There are two types of supervised learning: regression (predicting a continuous value) and classification (separating the data into categorical classes).
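To make the two types concrete, here is a minimal sketch in plain Python. It uses a least-squares line for regression and a 1-nearest-neighbor rule for classification; these are just two simple techniques chosen for illustration, not the only options. All numbers are made up (they are not taken from the Kaggle dataset), and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def nearest_neighbor_label(train_rows, train_labels, row):
    """1-nearest-neighbor: copy the label of the closest training row."""
    distances = [sum((a - b) ** 2 for a, b in zip(r, row)) for r in train_rows]
    return train_labels[distances.index(min(distances))]


# Regression: predict a continuous value (made-up data following y = 2x + 1).
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted_value = slope * 5 + intercept

# Classification: predict a class from made-up (glucose, BMI) rows.
rows = [(85, 22.0), (90, 24.5), (160, 33.0), (170, 35.5)]
outcomes = [0, 0, 1, 1]  # 1 = yes, 0 = no
predicted_class = nearest_neighbor_label(rows, outcomes, (155, 31.0))

print(predicted_value, predicted_class)
```

In both cases the labels in the training data are what the model learns from; the difference is only whether the thing being predicted is a number or a category.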

Unsupervised Learning: This is the opposite of supervised learning. Here there is no known ground truth, that is, no labels. The aim is to infer what the underlying structure may be by identifying similar elements and placing them in clusters or groups.

An example of an unlabeled training dataset:

Sample training dataset (Kaggle Mall Customer Segmentation Data) for segmenting customers into different groups based on shopping trends

Note: There is no label in the training dataset; the model has to make sense of the data based on similarity.
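Grouping by similarity can be sketched with k-means, one common clustering technique (the article does not name a specific algorithm, so this choice is an assumption). The customer rows below are made-up (annual income, spending score) pairs, not rows from the Kaggle dataset, and the starting centroids are picked by hand to keep the example deterministic.

```python
def k_means(points, centroids, iterations=10):
    """Plain k-means: repeatedly assign points to the nearest centroid,
    then move each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Made-up (annual income, spending score) pairs; note there is no label column.
customers = [(15, 80), (16, 78), (18, 82), (80, 20), (82, 18), (85, 22)]
centroids, clusters = k_means(customers, centroids=[customers[0], customers[3]])

for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

The algorithm is never told which customers belong together; the two groups (low income with high spending, high income with low spending) emerge from the distances alone.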

Reinforcement Learning: This is a type of machine learning where models are trained by rewarding desired behaviors and/or punishing undesired ones. For example, every time a kid messes up his room, you take away some of his favorite food, and every time he cleans his room, you give him some. What will the kid eventually learn? He will learn to behave in the way that maximizes the amount of his favorite food he receives as feedback.

An example of reinforcement learning is an AI poker bot.

It starts off playing poker randomly and improves as it learns which actions win it more money. After each hand, it reviews how it played and checks whether it would have made more money with different actions, such as raising rather than sticking with a bet. If the alternatives lead to better results, it becomes more likely to choose them in similar situations.
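A real poker bot is far more involved, but the core loop of trying actions, observing rewards, and favoring what paid off can be sketched with a much simpler technique: an epsilon-greedy agent choosing between two actions. Everything here is an assumption made for illustration: the action names, the payout ranges, the function name, and the parameter values.

```python
import random


def train_agent(payouts, steps=1000, epsilon=0.1, seed=0):
    """Epsilon-greedy learning: mostly pick the action with the best
    estimated reward so far, but explore a random action now and then."""
    rng = random.Random(seed)
    estimates = {action: 0.0 for action in payouts}
    counts = {action: 0 for action in payouts}
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(list(payouts))  # explore
        else:
            action = max(estimates, key=estimates.get)  # exploit
        low, high = payouts[action]
        reward = rng.uniform(low, high)  # feedback from the environment
        counts[action] += 1
        # Running average of the rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


# Hypothetical payout ranges: raising pays more on average than calling.
payouts = {"call": (-0.5, 0.5), "raise": (0.5, 1.5)}
estimates, counts = train_agent(payouts)
best_action = max(estimates, key=estimates.get)

print(best_action, counts)
```

The agent is never told which action is better. It discovers this from the rewards and ends up choosing the higher-paying action most of the time, while still exploring occasionally.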

Final Thoughts

Machine learning is increasingly used in many industries today. An important point to remember while working on machine learning solutions: it's not about using the best and latest algorithm to build a model. It's still about solving a business problem.

The foundation of a successful machine learning project is an understanding of the business problem, the quality of the data chosen, and a best-fit architecture to industrialize the model on-prem or in the cloud.

Now is the time to invest in machine learning solutions. At Hashmap, an NTT Data Company, we have the right expertise and tools to guide and help businesses succeed with machine learning solutions and initiatives. Reach out to us here.

Jhimli Bora is a Cloud and Data Engineer with Hashmap, an NTT Data Company, providing Data, Cloud, IoT, and AI/ML solutions and consulting expertise across industries with a group of innovative technologists and domain experts accelerating high-value business outcomes for our customers. Connect with her on LinkedIn.
