What is Machine Learning?

Übermensch
2 min read · May 21, 2023

Image from: https://www.totalphase.com/blog/2022/11/machine-learning-and-a-i-examples-pros-and-cons/

Machine learning is a technique that teaches computers to learn patterns in input data and use them to predict outcomes. Many algorithms are used for machine learning, such as Decision Tree, Linear Regression, Naïve Bayes, Artificial Neural Networks (ANN), and many more.
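
As a minimal sketch of what "learning a pattern and predicting an outcome" can look like, the snippet below fits a straight line to a handful of points with ordinary least squares (the simplest form of Linear Regression) and then predicts a value for an input it has not seen. The data and the function names `fit_line` and `predict` are made up for illustration only.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply the learned line to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Made-up training data that follows y = 2x + 1
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

model = fit_line(xs, ys)        # "learning" the pattern
prediction = predict(model, 6)  # predicting an unseen outcome
print(model, prediction)
```

The model never sees the rule y = 2x + 1 directly; it recovers the slope and intercept from the examples alone, which is the core idea behind the algorithms listed above.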

A main purpose of using machine learning algorithms with big data is to:

  • Find the hidden relationships or patterns between independent and dependent variables

Independent variables
= Features: every column except the target (usually all columns but the last)
Dependent variable
= Target: usually the last column
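
The split between features and target can be sketched in a few lines of plain Python. This assumes the common layout described above, where the target sits in the last column; the rows and the helper name `split_features_target` are hypothetical.

```python
# Hypothetical rows: every column except the last is a feature,
# and the last column is the target.
rows = [
    [5.1, 3.5, "yes"],
    [4.9, 3.0, "no"],
    [6.2, 3.4, "yes"],
]


def split_features_target(rows):
    """Separate independent variables (features) from the dependent variable (target)."""
    features = [row[:-1] for row in rows]  # all columns but the last
    target = [row[-1] for row in rows]     # the last column only
    return features, target


X, y = split_features_target(rows)
print(X)
print(y)
```

If a dataset keeps its target somewhere other than the last column, the slicing would need to change accordingly.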

Let’s say we have the table below and want to analyze whether each customer will buy a product, in order to narrow down the target customers for marketing purposes.

In this case, our target is ‘Buy’, a Boolean column placed last, and it is the dependent variable. In real-world datasets, the target is often placed in the last column for convenience of analysis. On the other hand, the ‘Gender’, ‘Age’ and ‘Income’ columns are the independent variables, also called features.

When applying machine learning algorithms to a dataset, it is recommended to exclude index or ID columns, such as ‘ID’ in the table above, because they only serve as identifiers and carry no information about the target values for analysis and prediction.
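
Dropping an identifier column before training might look like the sketch below. The column names follow the example table, but the values are invented placeholders rather than the article's actual data, and `drop_columns` is a hypothetical helper.

```python
# Column names match the example table; the VALUES are invented placeholders.
records = [
    {"ID": 1, "Gender": "F", "Age": 25, "Income": 3000, "Buy": True},
    {"ID": 2, "Gender": "M", "Age": 40, "Income": 1500, "Buy": False},
]


def drop_columns(records, names):
    """Return copies of the records without the given columns."""
    return [
        {key: value for key, value in record.items() if key not in names}
        for record in records
    ]


cleaned = drop_columns(records, {"ID"})
print(cleaned[0])
```

The original records are left untouched, so the ID can still be used later to match predictions back to customers.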

Looking through the table from ID 1 to ID 9, we can guess that the tenth customer, who is female and has an income of 3000, is highly likely to buy the product, based on the behavior of earlier customers such as ID 1, 2, and 7, who are similar to ID 10. With machine learning, this prediction process can be performed quickly and easily even on very large datasets, so-called big data.
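
The "look at similar customers" reasoning above can be approximated with a k-nearest neighbors vote. This is only one possible illustration, not necessarily the method the example had in mind. The nine rows below are invented stand-ins for the table (built so that IDs 1, 2, and 7 resemble the tenth customer), and the distance formula, including the income scaling, is an assumption.

```python
from collections import Counter

# INVENTED sample data: (ID, Gender, Income, Buy). Not the article's real table.
customers = [
    (1, "F", 3000, True),
    (2, "F", 3200, True),
    (3, "M", 1500, False),
    (4, "M", 4000, False),
    (5, "F", 1200, False),
    (6, "M", 2800, False),
    (7, "F", 2900, True),
    (8, "M", 5000, True),
    (9, "F", 1000, False),
]


def distance(gender_a, income_a, gender_b, income_b):
    """Smaller means more similar. The weighting is an arbitrary assumption."""
    gender_gap = 0.0 if gender_a == gender_b else 1.0
    income_gap = abs(income_a - income_b) / 1000
    return gender_gap + income_gap


def predict_buy(customers, gender, income, k=3):
    """Majority vote among the k most similar previous customers."""
    ranked = sorted(customers, key=lambda c: distance(c[1], c[2], gender, income))
    nearest = ranked[:k]
    votes = Counter(c[3] for c in nearest)
    return votes.most_common(1)[0][0], [c[0] for c in nearest]


# The tenth customer: female, income 3000
prediction, neighbor_ids = predict_buy(customers, "F", 3000)
print(prediction, neighbor_ids)
```

With this invented data, the three most similar customers are IDs 1, 7, and 2, and all of them bought the product, so the vote predicts a purchase.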

Imagine we have a sample dataset with over 100,000 rows and 100 columns in an Excel file. For the human eye, exploring and analyzing the dataset’s patterns and interrelationships at a glance is either too hard or takes too long.

With machine learning algorithms, however, we can find the hidden relationships or patterns in the data and predict future values, such as market prices or customer preference trends, by analyzing the data’s characteristics. That’s why we need machine learning techniques in this big data era, in which at least 2.5 quintillion bytes of data are produced every day.
