AI Masterclass Tutor — Lesson 1

Shahani Markus
3 min read · Nov 13, 2023

I will keep my notes and comments italicized to distinguish my commentary from my AI Tutor’s text. This is the second in a sequence of three posts (links to the first and the third).

My Personal (AI) Tutor decided to dive directly into Lesson 1, “An Introduction to Advanced Machine Learning Algorithms”, focusing on areas that leverage my supposedly strong background in computer science and mathematics. Yikes!

The following was the content of my first lesson!

Deep Learning Overview

1. Neural Networks Fundamentals

Basic Structure: A neural network consists of layers of neurons. Each neuron in one layer connects to neurons in the next layer through weights. The first layer is the input layer, followed by one or more hidden layers, and finally an output layer.
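
To make the layered structure concrete, here is a minimal forward pass in Python/NumPy. The layer sizes (4 inputs, one hidden layer of 8 neurons, 2 outputs) are arbitrary choices for illustration, not something from the lesson:

```python
import numpy as np

rng = np.random.default_rng(0)

# Weights and biases connecting each layer to the next
# (4 inputs -> 8 hidden neurons -> 2 outputs; sizes chosen arbitrarily).
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 2)), np.zeros(2)

def forward(x):
    hidden = np.maximum(0, x @ W1 + b1)  # hidden layer with ReLU activation
    return hidden @ W2 + b2              # output layer (raw scores)

print(forward(rng.normal(size=4)))       # two output scores for one input
```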

Activation Functions: These functions determine the output of a neuron. Common examples include (a short sketch in code follows this list):

  • ReLU (Rectified Linear Unit): Outputs the input if it’s positive; otherwise, it outputs zero.
  • Sigmoid: Transforms the input into a range between 0 and 1, useful for binary classification.
  • Tanh (Hyperbolic Tangent): Similar to sigmoid but transforms input into a range between -1 and 1.
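
Here is a quick sketch of those three functions in NumPy:

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)        # zero for negative inputs, identity otherwise

def sigmoid(x):
    return 1 / (1 + np.exp(-x))    # squashes input into the range (0, 1)

def tanh(x):
    return np.tanh(x)              # squashes input into the range (-1, 1)

x = np.array([-2.0, 0.0, 2.0])
print(relu(x), sigmoid(x), tanh(x))
```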

2. Convolutional Neural Networks (CNNs)

Purpose: Primarily used for image processing and computer vision tasks.

Key Layers:

  • Convolutional Layer: Applies a series of learned filters to the input to create feature maps.
  • Pooling Layer: Reduces the spatial size of the feature maps, cutting down the number of parameters and the computation in the network.

Key Property:

  • Feature Extraction: CNNs learn to detect important features directly from the training data, without hand-engineered feature extractors.

[Figure: a visual representation of how CNNs work]
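
As a rough sketch of those layers in PyTorch (the 28×28 single-channel input and the layer sizes are my own illustrative assumptions, not part of the lesson):

```python
import torch
import torch.nn as nn

# Minimal CNN: conv -> ReLU -> pool, twice, then a linear classifier.
# Input is assumed to be a 1-channel 28x28 image (e.g. MNIST-sized).
model = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1),  # convolutional layer: 16 feature maps
    nn.ReLU(),
    nn.MaxPool2d(2),                             # pooling layer: 28x28 -> 14x14
    nn.Conv2d(16, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),                             # 14x14 -> 7x7
    nn.Flatten(),
    nn.Linear(32 * 7 * 7, 10),                   # 10 output classes
)

x = torch.randn(1, 1, 28, 28)                    # one fake grayscale image
print(model(x).shape)                            # torch.Size([1, 10])
```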

3. Recurrent Neural Networks (RNNs)

Purpose: Designed for sequential data (e.g., time series, sentences).

Key Concept: An RNN keeps a “memory” of previous inputs in the sequence via its hidden state, allowing it to maintain context.

Challenges:

  • Vanishing Gradient Problem: In long sequences, the gradient shrinks as it is propagated back through the time steps; early steps receive almost no learning signal, so the network effectively stops learning long-range dependencies (a tiny numeric illustration follows this list).
  • Exploding Gradient Problem: Conversely, the gradient can grow excessively large, leading to unstable learning.
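
A minimal numeric sketch of the vanishing-gradient effect, assuming a chain of sigmoid activations and ignoring the weight matrices for simplicity: backpropagation multiplies one local derivative per step, and the sigmoid’s derivative is at most 0.25, so the product shrinks exponentially with sequence length.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# The gradient through a chain of sigmoid steps is a product of local
# derivatives, each at most 0.25, so it decays exponentially with depth.
grad, x = 1.0, 0.5
for step in range(1, 51):
    s = sigmoid(x)
    grad *= s * (1 - s)        # sigmoid derivative at this step
    if step in (10, 30, 50):
        print(f"after {step} steps: gradient factor ~ {grad:.2e}")
```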

Solutions:

  • LSTM (Long Short-Term Memory): Addresses the vanishing gradient problem by introducing gates that regulate the flow of information.
  • GRU (Gated Recurrent Unit): A simpler variant of the LSTM with fewer parameters. A short sketch of both follows this list.
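
In PyTorch, both come as drop-in modules; this sketch (with arbitrary sizes) shows that the GRU keeps only a hidden state and carries fewer parameters than the LSTM:

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=10, hidden_size=32, batch_first=True)
gru = nn.GRU(input_size=10, hidden_size=32, batch_first=True)

x = torch.randn(4, 25, 10)   # batch of 4 sequences, 25 steps, 10 features each

out_lstm, (h, c) = lstm(x)   # LSTM keeps a hidden state h and a cell state c
out_gru, h_gru = gru(x)      # GRU keeps only a hidden state: fewer parameters

# Exploding gradients are usually handled separately, e.g. with
# torch.nn.utils.clip_grad_norm_ during training.
n_params = lambda m: sum(p.numel() for p in m.parameters())
print(n_params(lstm), n_params(gru))   # GRU has about 3/4 the LSTM's parameters
```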

Additional Considerations

  • Backpropagation: This is how neural networks learn. The gradient of the loss function with respect to each weight is computed via the chain rule, and the weights are adjusted to reduce the loss (a minimal example follows this list).
  • Overfitting and Regularization: Overfitting occurs when a model learns the training data too well and performs poorly on new, unseen data. Techniques like dropout (randomly dropping out neurons during training) and L2 regularization are used to prevent it (see the second sketch below).
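
First, a minimal backpropagation example using PyTorch’s autograd, with a single weight and a squared-error loss (the numbers are arbitrary):

```python
import torch

w = torch.tensor(2.0, requires_grad=True)   # one trainable weight
x, target = torch.tensor(3.0), torch.tensor(9.0)

loss = (w * x - target) ** 2   # squared-error loss: (2*3 - 9)^2 = 9
loss.backward()                # backpropagation: compute d(loss)/dw by the chain rule

print(w.grad)                  # d/dw (w*x - t)^2 = 2*(w*x - t)*x = -18
w.data -= 0.01 * w.grad        # one gradient-descent step on the weight
```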
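
Second, a sketch of the two regularization techniques named above: nn.Dropout randomly zeroes activations during training, and the optimizer’s weight_decay parameter applies an L2 penalty on the weights.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(20, 64),
    nn.ReLU(),
    nn.Dropout(p=0.5),        # dropout: randomly zeroes half the activations while training
    nn.Linear(64, 2),
)

# weight_decay adds an L2 penalty on the weights to the loss.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=1e-4)

model.train()                 # dropout active during training
print(model(torch.randn(1, 20)))
model.eval()                  # dropout disabled at evaluation time
print(model(torch.randn(1, 20)))
```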

After reading through it, I felt this was not the way I wanted to learn any complex topic. I told my AI Tutor to switch to more of a storytelling mode to make these complex topics more appealing and easier to understand.

What happened? It was quite hilarious, as detailed in my next post.

