A Non-Technical Introduction to AI: Part 1
A gentle introduction to everything you’ve wanted to know from an AI or Data Science person, but were too afraid to ask.
How do you…do that thing that you do?
Have you ever wondered how ChatGPT actually works under the hood? Have you asked yourself if you have to be a programmer in order to understand how these “AI” tools work? Are you worried you don’t understand enough about machines and are scared of where the world is headed with automation? Then buckle up, this post is for you!
As Artificial Intelligence (AI) and Machine Learning (especially Deep Learning) have surged to the forefront of public attention with tools like ChatGPT, DALL-E, and more, there is a whole host of new people invested in learning more about how these tools actually work. This is the first part in a series breaking down AI and Machine Learning/Deep Learning topics in a relatable, non-technical way for those of you who may not have a technical (that is to say, programming) background.
Deep learning is a subfield of Artificial Intelligence (AI) that has garnered significant attention in recent years, owing to its remarkable success in solving complex problems once thought to be impossible for computers. At the core of deep learning are neural networks, which are designed to mimic the human brain’s structure and function. In this blog post, we will take a gentle approach to understanding the fundamentals of deep learning and its applications in various domains.
To understand Deep Learning, we first have to understand that it is itself a subfield of Machine Learning. Let’s cover a brief introduction to the basics of machine learning before we dive deeper into Neural Networks (the basic component of deep learning).
Machine Learning 101
Machine learning (ML) is revolutionizing the way we live, work, and play. It seems like a magic wand that can turn once-impossible tasks into easy everyday activities… but how does it work? What are the secret ingredients that make this enchanting technology possible?
Imagine a world where computers can learn from experience, just like humans. In this world, our digital companions can analyze data, recognize patterns, and make decisions — all without any explicit programming! This may sound like something out of a futuristic novel, but it’s the world we live in today, thanks to machine learning.
At its core, machine learning is all about creating algorithms that can learn from data and adapt over time. Instead of being programmed to follow a set of instructions, ML algorithms evolve by analyzing vast amounts of information and extracting meaningful insights. It’s like having a digital detective that uncovers hidden clues and solves complex problems.
The Magical Recipe: Data + Algorithms = Machine Learning
The secret sauce that makes machine learning possible is the combination of data and algorithms. Data is the raw material that fuels the learning process, while algorithms are the set of rules and calculations that guide the machine’s analysis and decision-making.
1. Data: The Lifeblood of Machine Learning
Machine learning thrives on data, much like a gardener needs soil, sunlight, and water to grow plants. The more data the algorithm has to work with, the better it can learn and improve its performance. In the world of ML, data comes in various forms, such as images, text, numbers, and more. Nowadays our world is inundated on all sides by enormous amounts of data. Most people and organizations have more data collected than they know what to do with! This is another driving force behind the booming Machine Learning economy: there is a high “demand” for actionable insights to match this high “supply” of data.
2. Algorithms: The Brains Behind the Operation
Algorithms are the secret recipes that guide the machine’s learning process. In the traditional programming paradigm (of “computing”), a programmer writes explicit rules that take in data as input and produce an output. Machine Learning is a newer paradigm in which the rules themselves are learned from input data in order to make predictions. There are many different types of ML algorithms, each with its unique approach to solving problems. Machine learning can be broadly categorized into three types: supervised learning, unsupervised learning, and reinforcement learning.
- Supervised Learning: The algorithm learns from a labeled dataset (e.g. a bunch of fruits that each come with a label of “apple” or “not apple”). The algorithm then takes the given input-output pairs and tries to find a relationship between them. The large majority of business operations in the corporate sector rely on Supervised Learning; wherever there is a spreadsheet, there’s a good bet you can use Supervised Learning. It covers all kinds of tasks, such as classifying fraudulent transactions or loyal customers, forecasting future sales based on sales history, and many more! See the image below for examples, and note that the two main types of Supervised Learning tend to be classification and regression tasks.
- Unsupervised Learning: The algorithm learns from an unlabeled dataset, like an explorer discovering hidden patterns in unknown territory. Think of Lewis & Clark charting a path through the “unknown” continental United States, but instead the “terrain” is data, and the explorers are ML algorithms that are investigating the underlying structure or relationships within any data set you provide. Popular examples of Unsupervised learning include the Netflix recommendation system you use to find new movies, Amazon’s product recommender system, Spotify music recommendation, and more!
- Reinforcement Learning: The algorithm learns by interacting with an environment and receiving feedback in the form of rewards or penalties. It’s like training a dog to perform tricks using treats and praise (or whatever the opposite of that is, like “bad dog!”). Since Reinforcement Learning is still a young field, think of it as continually evolving. One place you can see Reinforcement Learning is in “gamification” applications where there is some type of reward for certain behaviors, such as robot skill acquisition or navigation. Reinforcement Learning has also been used to teach machines to play games like Chess or Go, where in recent decades it has beaten the “best of the best.” This is still a field being developed in research labs, but it will be hitting the streets in a big way soon.
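To make the “treats and praise” idea a bit more concrete, here is a toy sketch in Python. This is purely illustrative (the actions and reward values are made up, and real reinforcement learning algorithms are far more sophisticated): the “dog” simply tallies the average reward it received for each action and ends up preferring the best one.

```python
def train_dog(trials):
    """Toy reward-learning sketch (illustrative only, not a real RL algorithm).

    `trials` is a list of (action, reward) pairs, e.g. ("sit", +1) for a
    treat or ("bark", -1) for a scolding. The "dog" tracks the average
    reward of each action and learns to prefer the highest-scoring one.
    """
    totals, counts = {}, {}
    for action, reward in trials:
        totals[action] = totals.get(action, 0) + reward
        counts[action] = counts.get(action, 0) + 1
    averages = {action: totals[action] / counts[action] for action in totals}
    # The learned "policy": pick the action with the best average reward.
    return max(averages, key=averages.get)

# After a few treats for sitting and scoldings for barking, "sit" wins out.
best_trick = train_dog([("sit", 1), ("bark", -1), ("sit", 1), ("bark", 0)])
```

The feedback loop of trying actions and updating preferences from rewards is the essence of the paradigm, even though real systems learn from millions of interactions rather than four.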
Supervised Learning — The Bread and Butter of the Industry
Although we see many fancy applications of “deep learning” in recent news and the advances of AI, the bread and butter of the large majority of current industry applications falls under supervised learning or unsupervised learning. We’ll now go into a bit more depth about the two so you can have a working understanding of these two major modalities of Machine Learning. Think of them as the foundation stones or “building blocks” of every other type of AI you may see. They are the first tools that Data Scientists and Machine Learning Engineers learn in coursework and curricula, and they should not be ignored by anyone new exploring this space.
Imagine you are being taken on a guided tour through an orchard in order to learn how to recognize different types of fruits. Your guide points out various fruits and tells you their names: this is an apple, that is an orange, and so on. After seeing several examples of each fruit, you start to recognize patterns and can eventually identify the fruits on your own.
This is akin to supervised learning, where a machine learning model is trained using labeled data (the guide in our analogy). The model receives input data (the fruits) and is provided with the correct output (the fruit names). By repeatedly exposing the model to labeled examples, it learns to identify patterns and make accurate predictions when presented with new, unlabeled data. This is an example of Machine Learning Classification, one key sub-branch of Supervised Learning. In the image above, we are classifying between apples and strawberries in our supervised learning algorithm.
Note: Keep in mind that these data may not be (and likely are not) images, but rather structured data in a table format (spreadsheets) that expresses some pattern. Some quick example classification models: predicting customer churn (Yes or No), predicting return customers (“likely to return” or “unlikely to return”), flagging fraudulent transactions (“fraud” or “not fraud”), and many more! The more you see data with an understanding of different ML models, the more easily you’ll be able to tell which type of model fits which dataset.
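For the curious, here is a minimal sketch of the classification idea in Python. It is illustrative only: the single “transaction amount” feature, the numbers, and the one-threshold rule are all made up for this example, and a real fraud model would combine many features using a proper library. The point is simply that the model “learns” a number from the labeled examples and then uses it to label new, unseen transactions.

```python
def learn_threshold(amounts, labels):
    """Toy supervised classifier sketch (illustrative only).

    From labeled examples, learn one number: the midpoint between the
    average "fraud" amount and the average "not fraud" amount.
    """
    fraud = [a for a, label in zip(amounts, labels) if label == "fraud"]
    legit = [a for a, label in zip(amounts, labels) if label == "not fraud"]
    return (sum(fraud) / len(fraud) + sum(legit) / len(legit)) / 2

def predict(amount, threshold):
    """Classify a new, unlabeled transaction using the learned threshold."""
    return "fraud" if amount > threshold else "not fraud"

# Made-up labeled training data: small everyday purchases vs. huge ones.
threshold = learn_threshold(
    [10, 20, 900, 1100],
    ["not fraud", "not fraud", "fraud", "fraud"],
)
verdict = predict(600, threshold)  # classify a brand-new transaction
```

Notice the shape of the process: labeled examples in, a learned rule out, then predictions on new data. That shape is the same no matter how sophisticated the model gets.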
Supervised Learning: Regression
Sometimes the gatekeepers of the ML and AI world will unintentionally use terms that the rest of the world does not understand, so we’ll do our best here to ‘translate’ some of those terms so that the core info really sticks. One example is when technical people mention “features” or “targets.” When a Data Scientist or Machine Learning Engineer talks to you about “features,” what they really mean, in plain English, is the various types (or columns) of input data used to construct the model. In a general sense, you can think of these features as the characteristics used to train the Machine Learning model.
“Target” variables are the result or output that the Machine Learning model is typically trying to “predict.” This is a really important key to understanding, which we will talk more about soon when we think about how to choose the right model. Let’s walk through an example to help paint the picture more clearly.
Imagine you are trying to predict the price of a house based on its characteristics (or features), such as square footage, the number of bedrooms, and location. You have access to a dataset containing historical sales data, which you use to determine the relationships between the features and the sale prices. The historical sales data usually contains the price that the house was sold for (our target) and our features, such as square footage, # of bedrooms, # of bathrooms, etc. The goal of the model will be to figure out how these features correspond to the price level of different houses.
If you have ever been “house shopping,” you know that there are certain factors, like those in the image above, that have a big impact on house prices. The idea is that in a certain neighborhood, having an extra bedroom or bathroom may add a certain amount to the purchase price ($∆y). You probably guessed that the bigger the house and the “nicer” the neighborhood, the more expensive it is! The goal of our ML model, a “Linear Regression” in this case, is to figure out (to the best of its ability) the linear relationship between each of these feature variables and the target (house price).
This is a quick overview of a supervised learning regression problem, where a model is trained to predict a target variable (the house price) based on feature data (the house characteristics). The model learns from labeled examples (existing data for house prices), allowing it to make accurate predictions for new, unlabeled data. If you are interested in seeing a more detailed technical example (with code) about how to create a linear regression model for housing data, you can check out my other article: Predicting House Prices Using Multi-parametric Linear Regression.
Some other applications of regression include stock price prediction, sales forecasting, and determining product pricing.
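For readers who want a peek at the mechanics, here is a minimal sketch of the house-price idea in Python. It is illustrative only: just one made-up feature (square footage) with made-up prices, fit with the classic “least squares” formula, whereas a real model would use many features and a library such as scikit-learn.

```python
def fit_line(sqft, prices):
    """Fit price = slope * sqft + intercept by ordinary least squares.

    Toy one-feature linear regression sketch (illustrative only).
    """
    n = len(sqft)
    mean_x = sum(sqft) / n
    mean_y = sum(prices) / n
    # Slope: how much the price moves per extra square foot, on average.
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(sqft, prices)) \
        / sum((x - mean_x) ** 2 for x in sqft)
    # Intercept: the baseline price when the feature is zero.
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up historical sales: each extra square foot adds about $100.
slope, intercept = fit_line([1000, 1500, 2000], [150_000, 200_000, 250_000])
predicted = slope * 1800 + intercept  # estimate for an unseen 1,800 sq ft house
```

The two learned numbers (slope and intercept) are the whole model here; with more features, the model simply learns one slope per feature, but the “learn from labeled history, then predict” pattern is identical.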
Classification or Regression? How to Decide
When faced with a machine learning problem, it’s essential to choose the right method to tackle the challenge at hand. For supervised learning tasks, two popular techniques are classification and regression. But how do you decide between the two? Let’s break it down using non-technical language and real-life examples.
When deciding between classification and regression, consider the nature of the problem and the type of output (target variable) you need to predict. If the goal is to assign input data to distinct, well-defined categories (or binary values), classification is likely the right choice. If you need to predict a continuous numerical value (housing prices, stock value, sales forecast), regression is more suitable.
It is important to consider the data you have available. For classification tasks, you’ll need labeled examples with the correct category assignments to train the model. For regression tasks, you’ll need examples with continuous numerical values as the target outputs.
A categorical variable is one that represents distinct categories or groups. It can be either qualitative (descriptive) or quantitative (numeric) but is generally treated as non-numeric in analysis. Categorical variables are typically used for classification tasks in machine learning.
Examples of categorical variables:
- Colors (red, blue, green)
- Genders (male, female, other)
- Sizes (small, medium, large)
- Product categories (electronics, clothing, food)
- Customer churn (Yes or No / 1 or 0)
- Fraudulent Transactions (Fraud or Not Fraud)
A continuous variable is a numeric variable that can take any value within a specified range or interval. It represents measurements or quantities and typically has an infinite number of possible values. Continuous variables are used for regression tasks in machine learning.
Examples of continuous variables include:
- Age
- Height
- Weight
- Temperature
- Price
- Sales
The main thing to consider is whether the target variable is continuous or categorical, as this determines the nature of what you are trying to predict and therefore the best model to predict it.
By understanding the differences between classification and regression and considering the problem’s characteristics, you can confidently choose the appropriate supervised learning method for your specific use case.
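As a rule of thumb, the decision above can even be written down as a tiny Python sketch. This is a deliberately naive heuristic for illustration only; in practice you decide by understanding the business question, not by running code on the target column.

```python
def suggest_task(targets):
    """Naive heuristic sketch (illustrative only): guess classification
    vs. regression from a sample of target values."""
    distinct = set(targets)
    # Text labels like "fraud" / "not fraud" are categorical.
    if any(isinstance(value, str) for value in distinct):
        return "classification"
    # A numeric target with only a couple of distinct values (e.g. 0/1)
    # is usually a categorical label in disguise.
    if len(distinct) <= 2:
        return "classification"
    # Otherwise treat a numeric target as continuous.
    return "regression"
```

So churn labels or 0/1 fraud flags point to classification, while a column of sale prices points to regression, exactly mirroring the categorical-vs-continuous distinction above.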
Unsupervised Learning: Clustering
Imagine you are a city planner tasked with organizing neighborhoods based on the types of businesses present in each area. You do not have any information about which businesses belong to which neighborhood category. Instead, you analyze the data and notice patterns, such as a high concentration of coffee shops in one area and a cluster of tech companies in another. Based on these patterns, you group the businesses into neighborhoods.
This is similar to unsupervised learning clustering, where a model is trained to identify patterns and group input data into clusters based on similarity, without relying on labeled examples. The model discovers underlying structures in the data, allowing it to cluster new, unlabeled data accurately. Common applications of clustering include customer segmentation, anomaly detection, and social network analysis.
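Here is a minimal sketch of the best-known clustering algorithm, k-means, in Python. It is illustrative only: the “city map” coordinates are made up, and real work would use a library such as scikit-learn. Each point is repeatedly assigned to its nearest “centroid,” and each centroid then moves to the middle of the points assigned to it.

```python
def squared_distance(p, q):
    """Squared straight-line distance between two 2-D points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def kmeans(points, centroids, iterations=10):
    """Minimal 2-D k-means sketch (illustrative only)."""
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Step 2: move each centroid to the average of its assigned points.
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Made-up "city map": coffee shops near (1, 1), tech offices near (8, 8).
businesses = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(businesses, centroids=[(0, 0), (10, 10)])
```

No labels were given anywhere: the algorithm discovers the two neighborhoods purely from how close the points sit to one another, which is the defining trait of unsupervised learning.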
Unsupervised Learning: Dimensionality Reduction
Now, consider you are a marketing analyst trying to understand customer behavior. You have access to a large dataset containing numerous variables, such as age, income, location, and purchase history. The challenge here is that each variable is actually another dimension in your feature set, since we have to find some sort of connection across all these dimensions at once.
Analyzing this data can be overwhelming due to its high dimensionality, so you aim to reduce the number of variables while preserving the essential information. One useful unsupervised learning technique is to compress the dataset’s complex dimensionality down to the “principal components” in the data: this is known as Principal Component Analysis (PCA), which seeks to “flatten” the multi-dimensional data into a more digestible form. The goals of this dimensionality reduction are to reduce computational complexity while also improving the separability of the different inherent “classes” (or groups) in the data. Note that this is different from clustering, the other unsupervised learning technique we covered, where we group “clusters” of the existing data within the same space.
PCA and Clustering can also be combined in other advanced unsupervised learning analyses where we are both reducing the dimensions of the data and finding the clusters of the data within these “subspaces.” This process makes the data easier to analyze and visualize, without losing significant information. Applications of dimensionality reduction include data compression, feature extraction, and data visualization.
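For readers who want a peek under the hood, here is a toy 2-D sketch of the PCA idea in Python. It is illustrative only: real datasets have many dimensions and would use a library such as scikit-learn. The sketch finds the single direction along which some made-up data varies the most, which is exactly what the “first principal component” means.

```python
import math

def pca_first_component(points):
    """Toy 2-D PCA sketch (illustrative only): return the unit vector
    pointing in the direction of greatest variance in the data."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Entries of the 2x2 covariance matrix of the centered data.
    a = sum((p[0] - mx) ** 2 for p in points) / n
    b = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    c = sum((p[1] - my) ** 2 for p in points) / n
    # Largest eigenvalue of the symmetric 2x2 covariance matrix.
    lam = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    # A corresponding eigenvector: the first principal component.
    if b != 0:
        vx, vy = b, lam - a
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
    length = math.hypot(vx, vy)
    return vx / length, vy / length

# Made-up data lying along a diagonal: PCA recovers that diagonal direction.
direction = pca_first_component([(1, 1), (2, 2), (3, 3)])
```

Projecting every point onto this direction collapses two columns into one while keeping most of the spread, which is the “flattening” described above.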
Real-Life Applications: Machine Learning in Action
Machine learning is reshaping our world, with applications ranging from personalized recommendations to autonomous vehicles. Here are some examples of how ML is making our lives more comfortable and enjoyable. See if you can guess which type of Machine Learning category each of these applications falls into:
- Personalized Recommendations: Online platforms like Netflix and Amazon use ML algorithms to analyze user behavior and preferences, offering personalized suggestions for movies, products, and more. It’s like having a digital concierge that knows your tastes and desires.
- Healthcare: Machine learning is revolutionizing healthcare by improving diagnostics, predicting disease outbreaks, and optimizing treatment plans. It’s like having a team of digital doctors working tirelessly to keep us healthy and happy.
- Autonomous Vehicles: Self-driving cars use ML algorithms to process data from sensors, cameras, and radars to navigate and make decisions in real time. It’s like having a digital chauffeur that takes you from point A to point B safely and efficiently.
- Fraud Detection: Financial institutions use ML algorithms to analyze transactions and detect suspicious activities. It’s like having a digital detective that protects your hard-earned money from thieves and scammers.
- Language Translation: Machine learning powers language translation apps like Google Translate, enabling people to communicate across linguistic barriers. It’s like having a digital interpreter that helps you understand and connect with people from different cultures.
- Natural Language Processing: ML algorithms can analyze and understand human language, enabling applications like chatbots, virtual assistants, and sentiment analysis. It’s like having a digital linguist that deciphers the complexities of human communication.
- Image Recognition: Machine learning can recognize and classify objects within images, which is useful for applications like facial recognition, medical imaging, and autonomous navigation. It’s like having a digital artist that can analyze and interpret the visual world.
Tying it all Together: Embrace the Wonders of Machine Learning
Machine learning is an exciting and transformative technology that’s changing the way we live, work, and play. By understanding the basics of ML, you can better appreciate its potential and join the conversation about its future.
As you can see, machine learning isn’t as daunting as it might have seemed at first. With a little bit of whimsy, a pinch of humor, and a heaping spoonful of friendly analogies, we’ve managed to unravel the mysteries of ML and explore its fascinating applications. So, step into the world of AI with confidence and curiosity, and let the wonders of machine learning inspire you! In the next part of this series, we’ll go even “deeper” into the world of Deep Learning in order to “demystify” the nebulous and esoteric concepts of neural networks and reduce the technical jargon to plain English, so stay tuned!
Author’s Footnote: Portions of this article were supplemented by Generative AI tools in its writing. This was used for outline mapping of common concepts and for certain selected definitions. The Author accepts sole responsibility for the content expressed in this writing.