Basic Long Short-Term Memory (LSTM)

Armando Aguilar L.
5 min read · Feb 7, 2024


Long Short-Term Memory (LSTM) is a type of recurrent neural network (RNN) architecture designed to address the limitations of traditional RNNs in capturing long-range dependencies in sequential data. In this article, we’ll explore the basics of LSTM networks, their core functions, and their real-life applications, at a level suitable for beginners and intermediate learners.

What is LSTM?

LSTM is a specific type of neural network architecture that falls under the broader category of recurrent neural networks (RNNs). Unlike traditional RNNs, LSTMs are explicitly designed to overcome the vanishing gradient problem, allowing them to effectively capture and remember long-term dependencies in sequential data.

The primary advantage of LSTMs lies in their ability to maintain a memory cell, which can store and retrieve information over extended periods, making them well-suited for tasks involving time-series data, natural language processing, and more.

Functions of LSTM with Basic Formulas

LSTM networks consist of several key components, including input gates, forget gates, output gates, and a memory cell. Here’s a brief overview of their functions:

Input Gate (i_t):

The input gate controls the information that enters the memory cell. It utilizes the sigmoid activation function to determine which values to update.
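In the standard formulation, where x_t is the current input, h_(t-1) is the previous hidden state, σ is the sigmoid function, and W_i, b_i are learned parameters, this reads:

i_t = σ(W_i · [h_(t-1), x_t] + b_i)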

Forget Gate (f_t):

The forget gate decides which information to discard from the memory cell. It leverages the sigmoid activation function to determine the retention of past information.
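Its formula mirrors the input gate, with its own weights W_f and bias b_f:

f_t = σ(W_f · [h_(t-1), x_t] + b_f)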

Memory Cell (C_t):

The memory cell stores and updates information over time: the previous cell state is scaled by the forget gate and combined with new candidate values scaled by the input gate.
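In formula form, with C̃_t denoting the candidate values and ⊙ denoting element-wise multiplication:

C̃_t = tanh(W_C · [h_(t-1), x_t] + b_C)
C_t = f_t ⊙ C_(t-1) + i_t ⊙ C̃_t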

Output Gate (o_t):

The output gate determines the final output of the LSTM, considering the updated memory cell.
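Its formula follows the same pattern as the other gates:

o_t = σ(W_o · [h_(t-1), x_t] + b_o)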

Hidden State (h_t):

The hidden state is the output of the LSTM at each time step and is computed using the memory cell and the output gate.
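In formula form:

h_t = o_t ⊙ tanh(C_t)

To make these equations concrete, here is a minimal NumPy sketch of a single LSTM step. The stacked weight matrix W, its row-block layout, and the concatenation order of h_(t-1) and x_t are illustrative assumptions, not details taken from this article:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step. W stacks the input, forget, candidate, and output
    weights as four row blocks of shape (hidden, hidden + input);
    b stacks the matching biases."""
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b    # all four pre-activations at once
    i_t = sigmoid(z[0 * hidden:1 * hidden])      # input gate
    f_t = sigmoid(z[1 * hidden:2 * hidden])      # forget gate
    c_tilde = np.tanh(z[2 * hidden:3 * hidden])  # candidate cell values
    o_t = sigmoid(z[3 * hidden:4 * hidden])      # output gate
    c_t = f_t * c_prev + i_t * c_tilde           # updated memory cell
    h_t = o_t * np.tanh(c_t)                     # new hidden state
    return h_t, c_t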

Where Can We Use LSTM in Real Life?

LSTM networks find applications in various real-life scenarios due to their ability to model and predict sequential patterns. Some common applications include:

Time Series Prediction: LSTMs excel in forecasting future values in time-series data, such as stock prices, weather conditions, and energy consumption.

Natural Language Processing (NLP): LSTMs are widely used in NLP tasks, including language translation, sentiment analysis, and text generation, as they effectively capture contextual dependencies in text.

Speech Recognition: LSTMs play a crucial role in speech recognition systems, enabling accurate transcription and understanding of spoken language.

Healthcare: In healthcare, LSTMs are employed for tasks like predicting patient outcomes, disease progression modeling, and analyzing electronic health records.

Gesture Recognition: LSTMs can be used to recognize and interpret gestures in applications like human-computer interaction and sign language recognition.

Code Live 👉 Link

Imports

import yfinance as yf
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

Getting Historical Stock Data

# Ticker symbol for the stock to model
stock_symbol = 'TSLA'

# Download the full available price history from Yahoo Finance
stock_data = yf.download(stock_symbol, period="max")
stock_data.head(10)

# Keep only the closing prices and scale them to the range [0, 1]
closing_prices = stock_data['Close'].values.reshape(-1, 1)
scaler = MinMaxScaler(feature_range=(0, 1))
closing_prices_scaled = scaler.fit_transform(closing_prices)

LSTM Model Creation and Training

# Build supervised samples: each x is a window of n_steps values and each y is the value that follows it
def prepare_data(data, n_steps):
    x, y = [], []
    for i in range(len(data) - n_steps):
        x.append(data[i:(i + n_steps), 0])
        y.append(data[i + n_steps, 0])
    return np.array(x), np.array(y)

def create_lstm_model(input_shape):
    """
    Create and compile an LSTM model for time series prediction.

    Parameters:
    - input_shape (tuple): Shape of the input data in the form (time_steps, features).

    Returns:
    - model (Sequential): Compiled LSTM model.
    """
    model = Sequential()
    # Add the first LSTM layer with 50 units and return sequences for the next layer
    model.add(LSTM(units=50, return_sequences=True, input_shape=input_shape))
    # Add the second LSTM layer with 50 units
    model.add(LSTM(units=50))
    # Add a Dense layer with 1 unit for regression
    model.add(Dense(units=1))

    # Compile the model using the Adam optimizer and Mean Squared Error loss
    model.compile(optimizer='adam', loss='mean_squared_error')

    return model


# Code snippet for creating and training the LSTM model
n_steps = 60

# Prepare the training data using the defined function
x_train, y_train = prepare_data(closing_prices_scaled, n_steps)

# Reshape the input data to fit the LSTM model
x_train = np.reshape(x_train, (x_train.shape[0], x_train.shape[1], 1))

# Create an instance of the LSTM model
model = create_lstm_model((x_train.shape[1], 1))

# Train the model on the training data
model.fit(x_train, y_train, epochs=10, batch_size=32)

Making Predictions and Evaluation

# Predict on the training windows and map the scaled outputs back to dollar prices
train_predictions = model.predict(x_train)
train_predictions = scaler.inverse_transform(train_predictions)

# Compare against the actual closing prices that follow each training window
mse = mean_squared_error(closing_prices[n_steps:], train_predictions)
print(f'Mean Squared Error on Training Data: {mse}')

# Plot actual vs. predicted prices over the training period
import matplotlib.pyplot as plt
plt.figure(figsize=(12, 6))
plt.plot(stock_data.index[n_steps:], closing_prices[n_steps:], label='Actual Prices', color='blue')
plt.plot(stock_data.index[n_steps:], train_predictions, label='Predicted Prices', color='red')
plt.title(f'{stock_symbol} Stock Price Prediction using LSTM')
plt.xlabel('Date')
plt.ylabel('Stock Price (USD)')
plt.legend()
plt.show()
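The plot above compares predictions on the data the model was trained on. As a complement, here is a minimal sketch of forecasting the next closing price from the most recent n_steps days; it is not part of the original snippet, but it only reuses variables already defined above:

# Take the last n_steps scaled closing prices as the input window
last_window = closing_prices_scaled[-n_steps:].reshape(1, n_steps, 1)
# Predict the next scaled value and map it back to dollars
next_scaled = model.predict(last_window)
next_price = scaler.inverse_transform(next_scaled)[0, 0]
print(f'Predicted next closing price for {stock_symbol}: {next_price:.2f} USD')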

Long Short-Term Memory networks offer a powerful solution for handling sequential data by addressing the challenges posed by vanishing gradients in traditional RNNs. By understanding the basic functions and applications of LSTMs, beginners and intermediate learners can leverage this technology for a wide range of real-life tasks, contributing to advancements in various fields.
