Demystifying Deep Learning: A Guide to Understanding Neural Networks

Josh Anderson
7 min read · Mar 7, 2023

--

A discussion of the basics of deep learning and its applications.

Deep learning is a branch of machine learning that allows algorithms to understand complex relationships within data. It has numerous applications today, including medical diagnosis, speech recognition, computer vision, self-driving cars, weather forecasting, and data mining.

In this deep learning guide, I’ll be discussing artificial neural networks and how they are used in deep learning. While it is a vast subject with a lot of depth, I will do my best to cover the fundamentals of deep learning in a simple manner. That said, understanding deep learning is a journey that never really ends!

What is deep learning?

Deep learning is a subset of machine learning that uses algorithms to build a statistical model from input data through many iterations of nonlinear transformations. For example, in classification, the model is able to draw complex curves to separate categories of data. Here are examples of what a few nonlinear classification problems may look like:

Examples of classifying 2D data into blue and red nonlinear groups
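A classic toy version of a nonlinear classification problem is XOR: no single straight line separates the two classes in 2D, but adding a nonlinear feature makes a linear rule work. The sketch below (pure Python, illustrative values only) shows this with the hand-picked feature `x1 * x2`:

```python
# XOR: class 1 when exactly one input is 1, class 0 otherwise.
# No linear boundary w1*x1 + w2*x2 + b separates these points in 2D,
# but the nonlinear feature x1*x2 makes a simple linear rule work.
points = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def classify(x1, x2):
    # Linear score over the augmented features (x1, x2, x1*x2).
    score = x1 + x2 - 2 * (x1 * x2)
    return 1 if score > 0.5 else 0

predictions = [classify(x1, x2) for (x1, x2), _ in points]
labels = [label for _, label in points]
print(predictions == labels)  # → True
```

Hidden layers in a neural network learn useful nonlinear features like this automatically, rather than requiring them to be hand-crafted.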

How does deep learning work?

Deep learning is driven by a specific machine learning model called a neural network, which is inspired by the way neurons connect within the human brain. A biological neuron takes inputs from other neurons and either produces an activation or stays silent. The network of all these neural connections enables the brain to understand complex ideas like language, sight, and problem solving. They look something like this:

In deep learning, artificial neurons are connected to form artificial neural networks (ANNs). Each artificial neuron takes inputs from other artificial neurons and produces an output, just like a real neuron in the human brain. These networks become increasingly refined as their connections are adjusted with more data, much like a human brain becomes more sophisticated with age and experience. This is why deep learning provides a creative solution for machine learning problems that are difficult to solve with traditional algorithms. An ANN is composed of inputs (i.e., the data) that feed into at least one, but usually many, hidden layers that learn those complexities. The final hidden layer feeds its values into an output layer, which can produce anything from class probabilities to a predicted value of the outcome.

Basic ANN architecture
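To make the layer structure concrete, here is a minimal forward pass through a tiny network with 2 inputs, 3 hidden neurons, and 1 output. The weights are illustrative placeholders, not trained values:

```python
import math

def sigmoid(z):
    # A common nonlinear activation that squashes any value into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def forward(inputs, hidden_weights, output_weights):
    # Each hidden neuron computes a weighted sum of the inputs,
    # then applies the nonlinear activation.
    hidden = [sigmoid(sum(w * x for w, x in zip(ws, inputs)))
              for ws in hidden_weights]
    # The output neuron does the same over the hidden activations.
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)))

hidden_weights = [[0.5, -0.6], [0.1, 0.8], [-0.3, 0.2]]  # 3 hidden neurons
output_weights = [1.2, -0.7, 0.4]
print(forward([1.0, 0.0], hidden_weights, output_weights))
```

Real networks have many more neurons and layers, but the computation is the same pattern repeated: weighted sums followed by nonlinear activations.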

Training a neural network is a trial-and-error process: the network is fit to training data and then tested on new data to see how it performs. ANNs process the training data by looking at a single sample or a group of samples at a time. A full pass over the training data is called an epoch, or "iteration," and training typically repeats for many epochs so the network can gradually refine itself and learn the patterns in the data.
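The epoch-and-batch structure described above can be sketched as a loop. Here `update_weights` is a hypothetical stand-in for a real gradient step, not a real framework API:

```python
import random

# A toy dataset of (features, label) pairs.
data = [([0.0, 1.0], 1), ([1.0, 1.0], 0), ([1.0, 0.0], 1), ([0.0, 0.0], 0)]

def update_weights(batch):
    # Placeholder for a real training step (e.g., a gradient update);
    # here it just reports how many samples it processed.
    return len(batch)

epochs, batch_size = 3, 2
for epoch in range(epochs):
    random.shuffle(data)                   # reshuffle the data each epoch
    for i in range(0, len(data), batch_size):
        batch = data[i:i + batch_size]     # one mini-batch of samples
        update_weights(batch)              # one training step per batch
```

Each pass through the outer loop is one epoch; each iteration of the inner loop is one training step on a mini-batch.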

Training Deep Learning Networks

Deep learning networks can be trained using a variety of techniques, such as backpropagation. During training, the network adjusts the weights of each neuron to reduce the error between its predicted output and the ground truth label. These weights allow the activation of some artificial neurons to be more impactful on the output than others. These adjustments are what let the network model many complex relationships. Deep learning networks are typically trained on large data sets and require a significant amount of computing power and memory to work effectively.

A relatively simple algorithm for optimizing the training process is known as gradient descent. This algorithm minimizes the network’s error by using fundamental methods from calculus, like partial derivatives, to find the minimum error. It is a common method for adjusting the network’s weights through a series of small adjustments.

Example optimization of error in an ANN using stochastic gradient descent
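As a worked illustration of gradient descent on a single weight, consider the toy error function E(w) = (w − 3)², whose minimum is at w = 3. The derivative dE/dw = 2(w − 3) points uphill, so stepping against it moves w toward the minimum:

```python
# Gradient descent on the toy error function E(w) = (w - 3)**2.

def gradient(w):
    # Derivative of E with respect to w; positive means E increases with w.
    return 2 * (w - 3)

w, learning_rate = 0.0, 0.1
for _ in range(100):
    w -= learning_rate * gradient(w)  # small step against the gradient

print(round(w, 4))  # → 3.0
```

A real network applies this same idea simultaneously to millions of weights, with backpropagation computing each weight's partial derivative of the error.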

How does deep learning attain such impressive results?

ANNs can be designed in any shape or size. This flexibility allows engineers to tailor the architecture of the network to their specific problem. As networks get larger, they can learn more, but they also need more data. Because the rate of data collection has skyrocketed in the 21st century, ANNs have become more powerful than ever. Areas that may have been sparse in data before now have growing databases to train ANNs. They also allow for investigation of problems without much prior knowledge of the nature of those problems. Many machine learning algorithms are restricted to certain types of relationships in data; ANNs are treated as a wild card that can learn almost any type of relationship.

Applications of deep learning

Deep learning can be used for image classification, natural language processing, real-time image/video interpretation, general artificial intelligence, and more. In image classification, deep learning methods are used to detect and classify objects in images. In the real world, this has been used for applications like security systems, garbage sorting, electronic check deposits, and even self-driving cars:

In natural language processing, deep learning methods are used to analyze text data with the help of neural networks. Real world examples include grammar assistance like Grammarly or auto-correct, chatbots like ChatGPT, search engines, and translators. It is also a key part of personalized advertising:

Deep learning is also used in medicine in a variety of ways. An interesting example concerns medical data itself: because data in medicine is sparse and expensive to collect, deep learning is used to generate synthetic data that is plausible in the real world and can augment training sets.

Limitations of Deep Learning

The main bottlenecks of these networks are the computational resources and amount of data required. As networks and datasets get larger, faster and more powerful computers are needed. In research applications, ANNs can take weeks or even months to train even using powerful GPUs. If there was unlimited computing power, ANNs could learn almost any pattern there is to learn. There are a few other drawbacks to note:

  • Lack of explainability: Deep learning models can be difficult to interpret and understand due to their complex structure and large number of parameters. This makes it hard to diagnose and correct errors, or to explain how the model arrived at its predictions.
  • Vulnerability to adversarial attacks: Deep learning models are susceptible to adversarial attacks, where malicious actors manipulate inputs to cause the model to misbehave. This is particularly problematic in security-critical applications, such as self-driving cars or medical systems.
  • Limited generalization: Deep learning models may struggle to generalize well to new domains, particularly if the training data is significantly different from the testing data. Models can also drift, meaning relationships the model learned may be outdated or changed over time.
  • Lack of causal understanding: Many questions investigated with machine learning can involve investigating causality. Deep learning models are primarily used for prediction and classification tasks, but they do not provide a causal understanding of the relationships between input and output variables. This can limit their usefulness in applications where understanding cause and effect is important, especially in areas like medicine.

Conclusion

Deep learning is a field of artificial intelligence that allows computers to learn from data in a human-like fashion. It has been making headlines for its ability to drastically improve computers’ processing and decision-making abilities. As artificial intelligence continues to disrupt the way we live and work, deep learning is one of the most important types of algorithms driving this change. While it seems complicated at first glance, developing and deploying neural networks can actually be a simple process. ANNs continue to grow in their influence, and will likely be around for a long time.

Joshua Anderson is currently working towards a Ph.D. in Intelligent Systems at the University of Pittsburgh researching fairness in medical AI models.

Liked what you read?

Click here to see my other articles: https://medium.com/@talkai

Disclosure: some of this article was written with the help of AI-assistive technology
