Demystifying the ANN: A Journey Inside Artificial Neural Networks

Purvesh Kachhiya
3 min read · Jan 12, 2024


Artificial neural networks (ANNs) have become the backbone of modern AI, powering everything from image recognition to self-driving cars. But how do these complex systems actually work? This blog is a map to the inner workings of ANNs!

The Building Blocks: Neurons and Connections

Imagine a network of interconnected neurons, mimicking the human brain. Each neuron receives signals from its neighbors, processes them, and fires a signal of its own if the combined input is strong enough. This is the basic unit of an ANN.

Neurons are connected by weighted links; each weight determines the strength of the signal passed between them. These weights are like the knobs on a mixing board, adjusting the influence each neuron has on its neighbors.
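To make this concrete, here is a minimal sketch in plain NumPy (all values are made up): a single neuron computes a weighted sum of its inputs plus a bias, then passes the result through an activation function.

```python
import numpy as np

def neuron(inputs, weights, bias, activation):
    """One artificial neuron: weighted sum of inputs plus bias, then activation."""
    z = np.dot(weights, inputs) + bias   # the "mixing board" step
    return activation(z)

# Example with made-up numbers: three inputs and a simple step activation
step = lambda z: 1.0 if z > 0 else 0.0
output = neuron(np.array([0.5, -1.0, 2.0]),
                np.array([0.8, 0.2, 0.4]),
                bias=-0.5,
                activation=step)
print(output)  # prints 1.0: the weighted sum (0.5) is above 0, so the neuron fires
```

The activation function, covered next, is what decides whether the weighted sum is strong enough for the neuron to fire.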

The Spark of Activation: When Neurons Fire

A neuron’s decision to fire depends on an activation function. This function takes the weighted sum of its inputs and transforms it into an output signal.

Activation Functions:

Threshold Function:

  • Outputs a 1 if the input exceeds a threshold, 0 otherwise.
  • Simplest activation function, often used in early perceptron models.

Sigmoid Function:

  • S-shaped curve that maps inputs to values between 0 and 1.
  • Commonly used in early neural networks, but can suffer from vanishing gradients.

Rectifier Function (ReLU):

  • Outputs 0 for negative inputs, and the input itself for positive values.
  • Widely used in modern deep neural networks for its efficiency and its ability to mitigate vanishing gradients.

Hyperbolic Tangent Function (tanh):

  • An S-shaped curve that maps inputs to values between -1 and 1.
  • Similar to the sigmoid, but sometimes preferred because its output is centered around zero.

The choice of activation function depends on the task and the network architecture.
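For illustration, here is a minimal NumPy sketch of the four functions above (no particular library's API is assumed):

```python
import numpy as np

def threshold(z, t=0.0):
    """Outputs 1 if the input exceeds the threshold t, otherwise 0."""
    return np.where(z > t, 1.0, 0.0)

def sigmoid(z):
    """S-shaped curve squashing inputs into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """0 for negative inputs, the input itself for positive ones."""
    return np.maximum(0.0, z)

def tanh(z):
    """S-shaped curve squashing inputs into (-1, 1), centered at 0."""
    return np.tanh(z)

z = np.array([-2.0, 0.0, 2.0])
for f in (threshold, sigmoid, relu, tanh):
    print(f.__name__, f(z))
```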

Learning Through the Ages: Cost, Gradient, and Backprop

But how do these networks learn? It all starts with a cost function, which measures the difference between the network’s output and the desired outcome. Imagine this as the gap between your guess and the correct answer on a test.

The network then uses a technique called gradient descent to adjust its weights, minimizing the cost function. Think of it as inching closer to the correct answer by turning the knobs on the mixing board; a small worked example follows the list of variants below.

Gradient Descent: Optimization algorithm that iteratively adjusts model parameters (weights and biases) to minimize the cost function.

Key variants:

  • Batch Gradient Descent: Updates parameters after processing the entire dataset.
  • Stochastic Gradient Descent (SGD): Updates parameters after each individual training example, often faster and better for large datasets.
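As a hedged sketch of the idea (plain NumPy, a toy linear model, a mean-squared-error cost, and a made-up learning rate), here is batch gradient descent shrinking the cost step by step:

```python
import numpy as np

# Tiny made-up dataset: y is roughly 2*x + 1
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.1, 4.9, 7.2])

w, b = 0.0, 0.0           # parameters to learn
lr = 0.05                 # learning rate (step size)

for epoch in range(200):  # batch gradient descent: use the whole dataset each step
    y_pred = w * x + b
    error = y_pred - y
    cost = np.mean(error ** 2)          # mean squared error cost
    grad_w = 2 * np.mean(error * x)     # d(cost)/dw
    grad_b = 2 * np.mean(error)         # d(cost)/db
    w -= lr * grad_w                    # step downhill on the cost surface
    b -= lr * grad_b

print(w, b, cost)  # w and b should end up close to 2 and 1
```

For stochastic gradient descent, you would instead compute the gradients and update w and b after each individual (x, y) pair rather than averaging over the whole dataset.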

But how does the network know which weights to adjust and by how much? That’s where backpropagation comes in. This clever algorithm works its way backward through the network, calculating the “blame” each neuron shares for the error. Based on this blame, the weights are tweaked to collectively push the network towards better predictions.
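The following is a minimal sketch of that idea (a toy one-hidden-layer network with sigmoid activations and a squared-error cost, all values made up): the forward pass computes a prediction, the backward pass propagates the blame from the output back to every weight, and gradient descent applies the updates.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Made-up example: 2 inputs -> 2 hidden neurons -> 1 output
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 2)), np.zeros(2)
W2, b2 = rng.normal(size=(2, 1)), np.zeros(1)
x, target = np.array([0.5, -1.0]), np.array([1.0])
lr = 0.1

for step in range(1000):
    # Forward pass
    h = sigmoid(x @ W1 + b1)           # hidden layer activations
    y = sigmoid(h @ W2 + b2)           # network output
    cost = np.mean((y - target) ** 2)

    # Backward pass: propagate the "blame" from the output back to the hidden layer
    delta_out = 2 * (y - target) * y * (1 - y)     # blame at the output neuron
    delta_hid = (delta_out @ W2.T) * h * (1 - h)   # blame shared by the hidden neurons

    # Gradient-descent updates driven by the blame signals
    W2 -= lr * np.outer(h, delta_out)
    b2 -= lr * delta_out
    W1 -= lr * np.outer(x, delta_hid)
    b1 -= lr * delta_hid

print(cost)  # should be close to 0 after training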

From Pixels to Predictions: Putting it All Together

So, how does this all translate to real-world applications? Let’s take image recognition as an example. An image is fed into the input layer, pixel by pixel. Each layer extracts features from the image, like edges, shapes, and textures. The final layers then combine these features to identify the object in the image.
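As a rough sketch of that pipeline (plain NumPy, made-up layer sizes, and random untrained weights, so the prediction itself is meaningless), here is a flattened image flowing through a stack of layers until the final layer produces class scores. Real image models usually rely on convolutional layers, which this toy example omits:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Made-up sizes: a 28x28 grayscale image flattened to 784 pixels,
# two hidden layers, and 10 output classes (e.g. digits 0-9).
rng = np.random.default_rng(42)
layers = [
    (rng.normal(scale=0.01, size=(784, 128)), np.zeros(128)),
    (rng.normal(scale=0.01, size=(128, 64)), np.zeros(64)),
    (rng.normal(scale=0.01, size=(64, 10)), np.zeros(10)),
]

image = rng.random(784)             # stand-in for real pixel values
activation = image
for W, b in layers[:-1]:            # hidden layers extract features
    activation = relu(activation @ W + b)
W_out, b_out = layers[-1]
scores = softmax(activation @ W_out + b_out)   # probabilities over the 10 classes

print(scores.argmax(), scores.max())  # predicted class and its probability
```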

With enough training data and the right architecture, an ANN can learn to recognize cats, dogs, and even faces with remarkable accuracy. This is just a glimpse into the vast potential of ANNs, which are constantly evolving and pushing the boundaries of what machines can do.

From Theory to Practice: Applications of ANNs

The beauty of ANNs lies in their versatility. They can be trained to tackle a wide range of tasks, including:

  • Image recognition: Identifying objects, faces, and even emotions in images.
  • Natural language processing: Understanding and generating human language, powering chatbots and machine translation.
  • Fraud detection: Identifying suspicious patterns in financial transactions.
  • Medical diagnosis: Analyzing medical data to assist in disease diagnosis and treatment.

Conclusion

ANNs leverage interconnected artificial neurons, employing activation functions to process weighted signals. Gradient descent and backpropagation optimize network parameters by minimizing cost functions, enabling accurate pattern recognition and prediction across diverse domains.
