The Explanations of Data Science, Machine Learning, Artificial Intelligence and Deep Learning.

9 min read · Nov 16, 2022

First of all, what are the differences between Machine Learning, Data Science, Deep Learning and Artificial Intelligence? Here is a simple definition of each:

Data Science (DS): Data Science draws on all of the fields defined below (AI, ML and DL) to extract insights from data and make predictions from large datasets. *Note that the distinctions between these terms aren’t clear-cut.

Artificial Intelligence (AI): A program that can sense, reason, act and adapt; that is, a program with the ability to learn and reason like humans.

Machine Learning (ML): Algorithms whose performance improves as they are exposed to more data over time.

Deep Learning (DL): A subset of ML in which multilayered neural networks learn from vast amounts of data.

TYPES OF ARTIFICIAL INTELLIGENCE (AI)

Artificial Intelligence can be divided based on capabilities and functionalities.

There are three types of Artificial Intelligence based on capabilities.

Narrow AI
General AI
Super AI

Under functionalities, we have four types of Artificial Intelligence.

Reactive Machines
Limited Memory
Theory of Mind
Self-awareness

TYPES OF MACHINE LEARNING (ML)

Machine Learning is often categorized by how an algorithm learns to become more accurate in its predictions. There are four basic approaches:

1. SUPERVISED LEARNING: Supervised learning is defined by its use of labelled datasets. Using labelled inputs and outputs, the model can measure its accuracy and learn over time. There are two main tasks:

Classification (Binary and Multi-class Classification): Binary classification divides data into two categories, while multi-class classification chooses between more than two types of answers.
Regression: Regression is another type of supervised learning method that uses an algorithm to understand the relationship between dependent and independent variables. Regression models are helpful for predicting numerical values based on different data points, such as sales revenue projections for a given business.
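
To make the regression task concrete, here is a minimal sketch that fits a straight line by ordinary least squares using only the Python standard library. The function name `fit_line` and the advertising/revenue numbers are made up for illustration; in a real project a library such as scikit-learn would normally do this work.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y divided by the variance of x gives the slope.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up labelled data: advertising spend (input) and sales revenue (output).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
revenue = [3.0, 5.0, 7.0, 9.0, 11.0]  # constructed so that y = 2x + 1

slope, intercept = fit_line(ad_spend, revenue)
predicted = slope * 6.0 + intercept  # projection for an unseen spend of 6.0
print(slope, intercept, predicted)
```

The labelled pairs play the role of the training set, and the fitted line is then used to project a numerical value for an input the model has not seen.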

Examples of Supervised Learning Algorithms:

Linear Regression
Logistic Regression
Nearest Neighbor
Gaussian Naive Bayes
Decision Trees
Support Vector Machine (SVM)
Random Forest
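
As an illustration of one algorithm from this list, the sketch below implements a small nearest-neighbour classifier in plain Python. The name `knn_predict`, the choice of k = 3 and the tiny labelled dataset are assumptions made for the example, not a reference implementation.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote of its k nearest neighbours."""
    # Sort the labelled examples by Euclidean distance to the query point.
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    # The most common label among the k nearest neighbours wins.
    return Counter(nearest_labels).most_common(1)[0][0]


# Made-up labelled data: (height_cm, weight_kg) -> label.
points = [(20, 4), (22, 5), (25, 6), (60, 30), (65, 32), (70, 35)]
labels = ["cat", "cat", "cat", "dog", "dog", "dog"]

small_animal = knn_predict(points, labels, (23, 5))
large_animal = knn_predict(points, labels, (62, 31))
print(small_animal, large_animal)
```

Because there are only two labels, this is a binary classification task; the same function works unchanged for multi-class data.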

2. UNSUPERVISED LEARNING: Unsupervised learning uses machine learning algorithms to analyze and cluster unlabeled data sets. These algorithms discover hidden patterns in data without the need for human intervention (hence, they are “unsupervised”). There are four main tasks:

Clustering: Splitting the dataset into groups based on similarity.
Association: Identifying sets of items in a data set that frequently occur together.
Dimensionality Reduction: Reducing the number of variables in a data set.
Anomaly Detection: Identifying unusual data points in a data set.
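
The clustering task can be sketched with a bare-bones k-means loop. This is a simplified version written with the standard library only; the function name `kmeans`, the fixed starting centroids and the sample points are assumptions chosen so the example is deterministic.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means: alternate between assigning points and moving centroids."""
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters


# Unlabelled, made-up points that visibly form two groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
final_centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
print(final_centroids)
print(clusters)
```

No labels are supplied anywhere: the grouping comes only from how close the points are to each other.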

3. SEMI-SUPERVISED LEARNING: This approach to machine learning mixes the two preceding types: it trains on a small amount of labelled data together with a larger amount of unlabelled data. Semi-supervised learning is well suited to medical images, where a small amount of labelled training data can lead to a significant improvement in accuracy. Example applications include Machine Translation, Fraud Detection, and Labelling Data.

4. REINFORCEMENT LEARNING: Data scientists typically use reinforcement learning to teach a machine to complete a multi-step process for which there are clearly defined rules. Reinforcement learning is often used in areas such as Robotics, Video gameplay, and Resource Management.
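
A toy example may help here. The sketch below uses tabular Q-learning, one common reinforcement learning method, on a made-up five-cell corridor where the agent must walk to the rightmost cell. The environment, the constants `ALPHA` and `GAMMA`, and the purely random exploration are all assumptions chosen to keep the example short.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # move left or move right
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor


def step(state, action):
    """Environment rules: stay inside the corridor, reward 1 on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            a = rng.randrange(2)  # explore by picking an action uniformly at random
            next_state, reward = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward
            # reward + discounted value of the best next action.
            target = reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


q_table = train()
# Greedy policy: in each non-goal cell, pick the action with the highest learned value.
policy = [ACTIONS[max((0, 1), key=lambda a: q_table[s][a])] for s in range(N_STATES - 1)]
print(policy)
```

The agent is never told which direction is correct; it only receives a reward at the goal and works backwards from there over many episodes.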

TYPES OF DEEP LEARNING (DL)

Deep learning (also known as deep structured learning) is part of a broader family of machine learning methods based on artificial neural networks with representation learning. Learning can be supervised, semi-supervised or unsupervised.

Here is a list of 10 popular deep learning algorithms:

Convolutional Neural Networks (CNNs)
Long Short-Term Memory Networks (LSTMs)
Recurrent Neural Networks (RNNs)
Generative Adversarial Networks (GANs)
Radial Basis Function Networks (RBFNs)
Multilayer Perceptrons (MLPs)
Self-Organizing Maps (SOMs)
Deep Belief Networks (DBNs)
Restricted Boltzmann Machines (RBMs)
Autoencoders
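
To show what "multilayered" means in practice, here is a minimal sketch of the forward pass of a tiny Multilayer Perceptron with one hidden layer. The weights are hand-picked for illustration (they are not learned), and the helper names `step_fn`, `layer` and `xor_net` are hypothetical. Real deep learning code would learn the weights from data with a library such as TensorFlow, Keras or PyTorch.

```python
def step_fn(z):
    """Threshold activation: fire (1) when the weighted sum is positive."""
    return 1 if z > 0 else 0


def layer(inputs, weights, biases):
    """One fully connected layer: weighted sum per neuron, then activation."""
    return [
        step_fn(sum(w * x for w, x in zip(neuron_w, inputs)) + b)
        for neuron_w, b in zip(weights, biases)
    ]


# Hand-picked (not learned) weights: hidden neuron 1 behaves like OR, neuron 2 like AND.
HIDDEN_W, HIDDEN_B = [[1, 1], [1, 1]], [-0.5, -1.5]
# The output neuron fires when OR is on but AND is off, which is XOR.
OUTPUT_W, OUTPUT_B = [[1, -1]], [-0.5]


def xor_net(x1, x2):
    hidden = layer([x1, x2], HIDDEN_W, HIDDEN_B)
    return layer(hidden, OUTPUT_W, OUTPUT_B)[0]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_net(a, b))
```

XOR cannot be computed by a single neuron of this kind, so the example also hints at why stacking layers matters.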

Overview of an End-to-End Data Science Project (Steps, Libraries, IDEs, Programming Languages, Datasets, etc.)

An end-to-end data science project includes several steps. The commonly used steps are shown in the chart below.

A list of some of the best Python IDEs for data science and machine learning projects: Spyder, JupyterLab, PyCharm, Visual Studio Code, Thonny, Atom.

For data science projects, several different programming languages are used, such as Python, R, SQL, Java, Julia, Scala, C/C++, JavaScript, Swift, Go, MATLAB and SAS (these are listed among the top data science programming languages for 2022). Some of them are used in computer programs to implement algorithms. *In this article, the libraries and other explanations are the ones generally used with Python.

In the data collection step, there are multiple ways of gathering data, for instance using online data sources, pulling data through an API, or accessing data in databases. One popular way is to use online data sources that anyone can access and download for free for data science projects. Another way is to send requests to a website and build an automated data pipeline between the website and the requester, targeting a specific part of the website content. An API (Application Programming Interface) helps here: data can be pulled on an automated schedule or manually on demand.
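
Pulling data through an API can be sketched roughly as follows, using only the Python standard library. The endpoint `https://api.example.com/v1/measurements`, its query parameters and the shape of the JSON response are all hypothetical; a real API documents its own URL, parameters and fields.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://api.example.com/v1/measurements"


def build_url(base_url, **params):
    """Attach query parameters (filters, page size and so on) to the endpoint."""
    return f"{base_url}?{urlencode(params)}"


def fetch_json(url, timeout=10):
    """Send a GET request and decode the JSON body. This performs a real network call."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_rows(payload):
    """Keep only the fields of interest from the (assumed) response shape."""
    return [(item["date"], item["value"]) for item in payload["results"]]


# Offline demonstration with a sample payload shaped like the assumed response.
sample = json.loads(
    '{"results": [{"date": "2022-11-01", "value": 3.5}, {"date": "2022-11-02", "value": 4.0}]}'
)
url = build_url(BASE_URL, city="Istanbul", limit=2)
rows = extract_rows(sample)
print(url)
print(rows)
```

Calling `fetch_json(url)` from a scheduled job would give the automated pull, while running it by hand gives the on-demand one.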

Here is a list of a few sources for datasets: Data Is Plural, BuzzFeed, NASA, AWS Public Data Sets, Wikipedia, Quandl, Data.World, World Bank Open Data, Kaggle, Google Finance, UNICEF publications, Our World in Data, Google Public Data Explorer, FiveThirtyEight, Socrata, GitHub, UCI Machine Learning Repository, Data.gov, Academic Torrents, Nasdaq Data Link, Twitter, YouTube, wunderground, global health organization, Pew Research Center, National Climatic Data Center.

The best known is Kaggle, a popular online platform with over 50,000 public datasets on a wide range of topics, where both data and code are easy to find.

Another method is SQL (Structured Query Language), a special-purpose programming language for managing and accessing data in databases. With SQL, you can easily manage and analyze large amounts of raw data. *Note that Python is a general-purpose scripting language, while SQL is a query language.

The main difference between SQL and Python is that developers use SQL to access and extract data from a database, while developers use Python to analyze and manipulate data by running regression tests, time series tests, and other data manipulation calculations.
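
The division of labour between the two languages can be seen in a short sketch that uses Python's built-in `sqlite3` module. The `sales` table and its values are made up for the example.

```python
import sqlite3
import statistics

# An in-memory database with a made-up `sales` table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("north", 80.0), ("south", 200.0), ("south", 100.0), ("south", 150.0)],
)

# SQL's job: access and extract the rows we care about.
rows = conn.execute(
    "SELECT amount FROM sales WHERE region = ? ORDER BY amount", ("south",)
).fetchall()
conn.close()

# Python's job: analyze and manipulate what came back.
amounts = [amount for (amount,) in rows]
mean_amount = statistics.mean(amounts)
median_amount = statistics.median(amounts)
print(amounts, mean_amount, median_amount)
```

The query decides which data leaves the database; everything after `fetchall()` is ordinary Python working on the extracted values.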

For Machine Learning and Deep Learning projects, some libraries are commonly used. Which ones to choose depends on the purpose.

TensorFlow
NumPy
SciPy
Pandas
Matplotlib
Keras
scikit-learn
PyTorch
Scrapy
BeautifulSoup

For data visualization, Matplotlib, Seaborn, ggplot, Plotly, and Bokeh libraries can be used in Python.

Power BI and Tableau are useful tools primarily used by data scientists and business analysts to extract valuable information from raw datasets and use it for business. Each is a collection of Business Intelligence and data analytics tools that allows the user to collect data from varied sources, in both structured and unstructured formats, and convert that data into visualizations and other insights.

These tools save Data Scientists a lot of time by generating appealing visualizations quickly and without coding. Exploratory Data Analysis (EDA) is important for Data Science processes: a Data Scientist needs to be able to quickly visualize the data they’re dealing with before creating the model, and Tableau helps with that. A disadvantage of these tools for data science, however, is that their visualizations cannot be integrated into the platform.
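
For readers who prefer code to a BI tool, a first pass at EDA can be sketched with the standard library alone. The helpers `summarize` and `text_histogram` and the `ages` column are hypothetical; in practice Pandas with Matplotlib or Seaborn would be the usual choice.

```python
import statistics


def summarize(values):
    """Return basic descriptive statistics for one numeric column."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }


def text_histogram(values, bin_width):
    """Count values per bin and draw one '#' per value, as a crude histogram."""
    counts = {}
    for v in values:
        start = int(v // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    return [
        f"{start:>3}-{start + bin_width - 1:<3} {'#' * n}"
        for start, n in sorted(counts.items())
    ]


ages = [22, 25, 25, 31, 38, 41, 47, 52]  # made-up column
summary = summarize(ages)
histogram = text_histogram(ages, bin_width=10)
print(summary)
print("\n".join(histogram))
```

Even this crude view shows the range, the centre and the rough shape of a column before any model is built.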

Data Cloud Platforms: As data scientists deal with solving complex business problems through building models and deploying algorithms, the right tools become essential to effectively manage different aspects of a project pipeline. Taking your data science projects to the cloud comes with advantages like the ability to scale, access to all the latest tools, and less maintenance on the user side. Some of the most common cloud-based platforms for data science projects include Amazon Web Services, Google Cloud Platform, IBM Watson and Microsoft Azure.

An operating system (OS) is system software that manages computer hardware and software resources and provides common services for computer programs. Examples of operating systems are Microsoft Windows, Mac OS X, GNU/Linux, BeOS, Android and iOS. Data science work can bring some difficulties, so the choice of operating system matters.

With so much data being generated every day, it’s becoming increasingly difficult to manage using traditional methods. This has led to the development of multiple frameworks and technologies to help with the management and processing of big data. There are many different technologies that you can use to build a modern data infrastructure. Three of the most popular big data frameworks from the Apache Software Foundation are Apache Hadoop, Apache Spark, and Apache Kafka.

I tried to explain the definitions and sub-categories of Data Science, Machine Learning and Artificial Intelligence with the help of the sources mentioned in the references section. I also tried to cover the tools that can be encountered in a data science project or in this sector. In my next article, I will try to give examples from the industry. I wish everyone a good day.

REFERENCES

https://en.wikipedia.org/wiki/Deep_learning

How are AI, Machine Learning, Deep Learning & Data Science Related? (www.corpnce.com)

Artificial Intelligence vs. Machine Learning vs. Data Science (www.deviq.io)

Supervised vs. Unsupervised Learning: What's the Difference? (www.ibm.com)

https://www.datacamp.com/blog/top-programming-languages-for-data-scientists-in-2022

12 Data Science Projects To Try (From Beginner to Advanced) (www.springboard.com)

Running a Data Science Project: Data Gathering (www.linkedin.com)

24 Free Datasets for Building an Irresistible Portfolio (2022) (www.dataquest.io)

20 Data Science Projects with Source Code for Beginners (www.dataquest.io)

How to pull data from an API (acho.io)

6 Best Python IDEs for Data Science & Machine Learning [2022] | upGrad blog (www.upgrad.com)

SQL Becerileri Size Daha İyi Veri Bilimi Fırsatları Sağlar [SQL Skills Give You Better Data Science Opportunities] - IoT Türkiye | Türkiye'nin En Büyük… (ioturkiye.com)

Top 10 Python Libraries for Data Science for 2023 (www.simplilearn.com)

How is Tableau helpful for Data scientists? - Intellipaat (intellipaat.com)

https://analyticsindiamag.com/how-to-use-cloud-platforms-for-your-data-science-projects/#:~:text=Some%20of%20the%20most%20common,IBM%20Watson%20and%20Microsoft%20Azure.

İşletim sistemi [Operating system] - Vikipedi (tr.wikipedia.org)

Hadoop vs. Spark vs. Kafka - How to Structure Modern Big Data Architecture? - nexocode (nexocode.com)

ML | Types of Learning - Supervised Learning - GeeksforGeeks (www.geeksforgeeks.org)

ML | Types of Learning - Part 2 - GeeksforGeeks