Understanding Basics of Data Science and Machine Learning (non-technical)

Akshay Toshniwal
Machine Learning India
Aug 3, 2021
Image by Machine Learning India

We are all aware of the pace at which data is growing every day, every second. This rise in data has made organizations think about how it can be put to use in a way that supports better decision-making and solves complex problems more simply.

Data Science

To do that, the field of data science was born. Data Science is an upgraded version of what a business intelligence professional or a data analyst does. A data science professional is called a Data Scientist, and their primary responsibility is to explore data, understand it, process it, and create insights that build value for the business.

With that in mind, Data Science can be defined in simple terms as follows:

A field of study that deals with data end to end and tries to generate value in ways that help businesses or individuals improve their work and processes. It helps identify hidden insights within the data, and it covers everything from extracting data to visualizing it as graphs and charts, and more.

A Data Scientist is a person who performs tasks such as the following (among others):

  1. Extracting information (data) from different sources (like websites, textual documents, PDFs, Excel and CSV files, multimedia data, and more)
  2. Understanding the value (importance) of the data extracted
  3. Processing the data into a structured (organized) format that can be further used to generate value and insights.
  4. Creating visuals out of the data (visuals meaning charts, graphs, pie charts) that help in understanding the data in a better way.
  5. Performing automation, scheduling certain data tasks, and building models (mathematical models and computer algorithms that help generate accurate and efficient insights)

I am sure that by now you have a good idea of what data science is and what the primary responsibilities of a data scientist are. Let’s go one step further.

Growth of Machine Learning

From 1959 (when the term Machine Learning was first coined) to today, the field has come a long way. Earlier systems were mostly rule-based: they had fixed conditions, and inferences (conclusions) were drawn accordingly. Today, modern machine learning algorithms focus on two primary things: prediction and classification.

Companies such as Google, Facebook, Amazon, Microsoft, and many others use AI-based technologies to serve their customers more smartly. Smart voice assistants like Amazon Alexa, Apple Siri, and Google Assistant use AI and ML to learn from user inputs and speech, and generate results for the end user accordingly. This is just one small example of how ML is being used today.

Other everyday examples include intelligent washing machines, smart refrigerators, and smart televisions. These are a few of the appliances that we use every day. Here, the term smart means intelligent, automated, and able to learn from user inputs.

Today, technological advancement has gone to another level, and so has development in the area of machine learning. Let us understand what this field is all about.

Machine Learning

I am sure that no matter which field you belong to, you have heard this term, as it is creating a lot of buzz across the IT and technology industry.

Machine Learning is about making systems (computers, machines, and software) smarter and more efficient by having them learn things the way we humans do. ML is one of the biggest subsets of AI, and it builds on Data Science. The Venn diagram below shows how the three are related.

A basic Venn Diagram highlighting how Data Science, Machine Learning, and Artificial Intelligence are related to each other.
Created by Akshay Toshniwal using Canva

In the simplest terms, Machine Learning can be defined as follows:

Machine learning is an area that combines data science capabilities and AI to make machines learn and think the way we humans do, in order to generate better outcomes from data. These outcomes can be prediction, classification, or cluster formation (grouping).

A person with skills focused on machine learning is called a Machine Learning Engineer or AI Engineer. The titles can vary depending on the organization working in the AI/ML area, but the responsibilities remain largely the same.

A few of an ML Engineer's responsibilities are as follows:

  1. Preprocessing the data in more depth, which involves understanding which parameters (values) from the dataset are to be retrieved.
  2. Selecting the machine learning algorithm for a given task. Machine Learning has multiple algorithms (methods) that generate different data outcomes. To make the data useful, one or more algorithms are selected.
  3. Training and validating the system. Machine Learning involves systems that learn, so the system is trained (made to learn) on the training data and then tested or validated on the validation data to check that its results are good.
  4. Making a few further iterations to make the system more efficient, both in how it learns and in the outcomes it generates.
  5. Lastly, deploying the machine learning system on cloud infrastructure (a cloud environment over the internet) to make it more scalable and reliable.

Having covered both Data Science and Machine Learning, I am sure it is clear how data is being used today to build better businesses. The power of data is tremendous, and those who use this power ethically will eventually win the game.

Why are Data Science and Machine Learning needed today?

As stated in the earlier sections, the rate at which data is growing is tremendous. A brief highlight can be seen in the graph below.

The graph below highlights the volume of data/information created, captured, copied, and consumed worldwide from 2010 to 2024. As you can see, the volume keeps increasing year on year, so it is extremely important to manage and process this huge amount of data.

The benefit of doing so is that processing and analysis uncover hidden and valuable insights in the data, which can be used to improve processes, businesses, and other areas of work. It is therefore high time we analyzed and structured real-time data, put it to use, and built insights from it.

The volume of data/information created, captured, copied, and consumed worldwide from 2010 to 2024. Statistics by Statista

Taking the growth of data into consideration, the field of data science is extremely important. The work is not only organizing the data, but also analyzing, predicting, classifying, or clustering information in ways that improve the overall decision-making process. This is done through machine learning.

Machine Learning keeps growing, and today it is often used with another area of artificial intelligence called Artificial Neural Networks. Machine learning built on neural networks with many layers is known as Deep Learning (DL).

Deep Learning is another promising area that helps in working with images, audio, video (multimedia data), textual information, and a lot more.

In my next article, I shall be talking about deep learning and what it is all about. That too will be a non-technical article so that the technology is understood by students and professionals working in other areas.

I hope this article was useful and helped you gain a basic understanding of Data Science, Machine Learning, and how they impact businesses.
