Developing a Streamlit web application to analyze sentiment towards Covid vaccines based on tweets.

Alidu Abubakari · Published in AI Science · 11 min read · May 14, 2023
https://huggingface.co/spaces/Abubakari/Covid_Vaccines_Tweet_Sentiment_Analysis

Check out the GitHub repo: https://github.com/aliduabubakari/Covid_vaccine-tweet-analytics-app.git

Introduction to Sentiment Analysis

Sentiment analysis, also known as opinion mining, is a Natural Language Processing (NLP) technique that involves extracting subjective information from text to determine the writer’s attitude towards a particular topic or product. Sentiment analysis is a powerful tool that can be used to analyze customer feedback, social media posts, news articles, and other types of text data to gain insights into public opinion and behavior.

The sentiment analysis NLP app developed by Team Harmony is a great example of how sentiment analysis can be used to analyze tweets related to Covid-19. The app uses a pre-trained model to predict the sentiment of the input text, which can be positive, negative, or neutral. The app is a part of a project to promote teamwork and collaboration among developers and is a great tool for analyzing public sentiment about Covid-19.

Understanding the Python Libraries Used

The sentiment analysis NLP app is built using several Python libraries that are commonly used in NLP tasks. Here is a brief overview of the libraries used in the code:

  1. Pandas: Pandas is a popular Python library for data manipulation and analysis. It is used in the app to create dataframes to store the results of the sentiment analysis.
  2. NumPy: NumPy is a library for numerical computing in Python. It is used in the app for numerical calculations.
  3. Streamlit: Streamlit is a Python library for building interactive web applications. It is used in the app to create the user interface and handle user input.
  4. Altair: Altair is a Python library for creating interactive visualizations. It is used in the app to create a bar chart of the sentiment scores.
  5. Transformers: Transformers is a library built by Hugging Face that provides state-of-the-art NLP models and tools. It is used in the app to load pre-trained models and perform sentiment analysis.
  6. PIL: PIL (Python Imaging Library), maintained today as Pillow, is a library for working with images in Python. It is used in the app to load and display images.
  7. Base64: The base64 module from Python's standard library encodes and decodes data in base64 format. It is used in the app to encode the cover image so it can be displayed in the user interface.

All of these are imported at the top of the script:
import pandas as pd
import numpy as np
import streamlit as st
import altair as alt
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from PIL import Image
import base64

Here is an example of how the NumPy library is used in the code to compute the confidence level of the sentiment analysis:

# Compute the confidence level (the maximum raw logit)
confidence_level = np.max(outputs.logits.detach().numpy())
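
Note that the maximum logit is an unnormalized score rather than a true probability. If a probability-style confidence is preferred, the logits can be passed through softmax first; here is a minimal sketch reusing the same outputs object:

# Softmax turns the raw logits into probabilities that sum to 1
probs = outputs.logits.softmax(dim=1).detach().numpy()

# The largest probability is an interpretable confidence in [0, 1]
confidence = np.max(probs)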

Loading and Using the Pre-trained Models

The sentiment analysis NLP app uses three fine-tuned transformer models, RoBERTa, BERT, and DistilBERT, all hosted on the Hugging Face Hub and loaded through the Hugging Face Transformers library. These models perform sequence classification tasks, where the input is a sequence of tokens and the output is a class label or a probability distribution over class labels.

To load the pre-trained models, the app uses the AutoTokenizer and AutoModelForSequenceClassification classes from the Transformers library. The AutoTokenizer class automatically selects the appropriate tokenizer based on the model's name, while the AutoModelForSequenceClassification class automatically loads the appropriate model architecture and weights based on the model's name.

Here is an example of how the ROBERTA model is loaded and used in the app:

# Define the available models
models = {
    "ROBERTA": "Abubakari/finetuned-Sentiment-classfication-ROBERTA-model",
    "BERT": "Abubakari/finetuned-Sentiment-classfication-BERT-model",
    "DISTILBERT": "Abubakari/finetuned-Sentiment-classfication-DISTILBERT-model"
}

# Select the model to use
model_name = "ROBERTA"

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(models[model_name])

# Load the model
model = AutoModelForSequenceClassification.from_pretrained(models[model_name])

# Tokenize the input text
inputs = tokenizer("This is an example sentence.", return_tensors="pt")

# Make a forward pass through the model
outputs = model(**inputs)

# Get the predicted class and associated score
predicted_class = outputs.logits.argmax().item()
score = outputs.logits.softmax(dim=1)[0][predicted_class].item()

# Print the predicted class and associated score
print(f"Predicted class: {predicted_class}, Score: {score:.3f}")

In this example, the ROBERTA model is loaded using its name from the models dictionary. The tokenizer and model are loaded with the AutoTokenizer.from_pretrained() and AutoModelForSequenceClassification.from_pretrained() methods, respectively. The input text is tokenized, a forward pass is made by calling the model with the tokenized inputs, and the predicted class and its score are read off the output logits and printed to the console.

Defining the Home Page and About Page

Here are some details about the script and code snippets to illustrate the different parts:

  1. The “How to Use” message: This is a Markdown message that explains how to use the app. It is defined as a string and rendered using the markdown function of the Streamlit module.
# Define the "How to Use" message
how_to_use = """
**How to Use**
1. Select a model from the dropdown menu
2. Enter text in the text area
3. Click the 'Analyze' button to get the predicted sentiment of the text
"""

# Add the "How to Use" message to the sidebar
st.sidebar.markdown(how_to_use)

2. The “Home” page: This page contains the form that allows the user to enter text and select a model for sentiment analysis. When the user submits the form, the app performs sentiment analysis and displays the results.

if choice == "Home":
    st.subheader("Home")

    # Add a dropdown menu to select the model
    model_name = st.selectbox("Select a model", list(models.keys()))

    with st.form(key="nlpForm"):
        raw_text = st.text_area("Enter Text Here")
        submit_button = st.form_submit_button(label="Analyze")

    # Layout
    col1, col2 = st.columns(2)
    if submit_button:
        # Display balloons
        st.balloons()

        # Perform sentiment analysis
        # ...

        # Display results
        # ...

3. The “About” page: This page contains information about the app and the team that developed it.

else:
    st.subheader("About")
    st.write("This is a sentiment analysis NLP app developed by Team Harmony for analyzing tweets related to Covid-19. It uses a pre-trained model to predict the sentiment of the input text. The app is part of a project to promote teamwork and collaboration among developers.")

The entire script imports necessary modules, defines some helper functions, and creates a Streamlit app with a title, a subtitle, and a sidebar. The app’s main content is displayed based on the user’s selection from the sidebar menu. When the user selects “Home,” the sentiment analysis form is displayed, and when the user selects “About,” information about the app is displayed.
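
Putting it all together, the page routing can be sketched as follows (a minimal outline; the exact title text and menu labels are assumptions based on the snippets above):

def main():
    st.title("Covid Tweets Sentiment Analysis App")  # title text assumed

    # The sidebar menu drives which page is shown
    menu = ["Home", "About"]
    choice = st.sidebar.selectbox("Menu", menu)
    st.sidebar.markdown(how_to_use)

    if choice == "Home":
        st.subheader("Home")
        # ... sentiment analysis form and results ...
    else:
        st.subheader("About")
        # ... information about the app ...

if __name__ == "__main__":
    main()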

Designing the User Interface with Streamlit

One important aspect of designing a user interface with Streamlit is the use of widgets. Widgets allow users to interact with the application and provide input for the program. The code above uses several widgets, including st.selectbox(), st.text_area(), and st.form_submit_button().

For example, the code uses the st.selectbox() widget to create a dropdown menu for selecting a model. Here is the relevant code snippet:

models = {
    "ROBERTA": "Abubakari/finetuned-Sentiment-classfication-ROBERTA-model",
    "BERT": "Abubakari/finetuned-Sentiment-classfication-BERT-model",
    "DISTILBERT": "Abubakari/finetuned-Sentiment-classfication-DISTILBERT-model"
}

model_name = st.selectbox("Select a model", list(models.keys()))

The code defines a dictionary of model names and their corresponding pre-trained models. The st.selectbox() function creates a dropdown menu with the model names as options, and stores the selected model name in the model_name variable.

The code also uses the st.text_area() widget to create a text area where users can input text for sentiment analysis. Here is the relevant code snippet:

with st.form(key="nlpForm"):
    raw_text = st.text_area("Enter Text Here")
    submit_button = st.form_submit_button(label="Analyze")

The with st.form() block defines a form with a key of "nlpForm". Inside the form, the st.text_area() function creates a text area where users can input text, and stores the input text in the raw_text variable. The st.form_submit_button() function creates a button with the label "Analyze", and stores whether the button was clicked in the submit_button variable.

Finally, the code uses the st.image() widget to display images in the application. Here is the relevant code snippet:

st.image("Cover_image.jpg")

The st.image() function displays the image with the filename "Cover_image.jpg".

These are just a few examples of the many widgets available in Streamlit. By using widgets effectively, you can create a user-friendly and interactive interface for your application.

Analyzing User Input Text with the Pre-trained Models

The next step in the NLP app is to analyze the user input text using the pre-trained models. In the Streamlit app, the user can select one of three available models: ROBERTA, BERT, or DISTILBERT.

After the user enters the input text and clicks on the “Analyze” button, the app tokenizes the input text using the selected pre-trained model. Then, it makes a forward pass through the model to get the predicted class and associated score for the input text.

Here is the relevant code from the main() function:

if submit_button:
    # Display balloons
    st.balloons()
    with col1:
        st.info("Results")
        tokenizer = AutoTokenizer.from_pretrained(models[model_name])
        model = AutoModelForSequenceClassification.from_pretrained(models[model_name])

        # Tokenize the input text
        inputs = tokenizer(raw_text, return_tensors="pt")

        # Make a forward pass through the model
        outputs = model(**inputs)

        # Get the predicted class and associated score
        predicted_class = outputs.logits.argmax().item()
        score = outputs.logits.softmax(dim=1)[0][predicted_class].item()

        # Compute the scores for all sentiments
        positive_score = outputs.logits.softmax(dim=1)[0][2].item()
        negative_score = outputs.logits.softmax(dim=1)[0][0].item()
        neutral_score = outputs.logits.softmax(dim=1)[0][1].item()

        # Compute the confidence level (the maximum raw logit)
        confidence_level = np.max(outputs.logits.detach().numpy())

        # Print the predicted class and associated score
        st.write(f"Predicted class: {predicted_class}, Score: {score:.3f}, Confidence Level: {confidence_level:.2f}")

        # Emoji
        if predicted_class == 2:
            st.markdown("Sentiment: Positive :smiley:")
            st.image("Positive_sentiment.jpg")
        elif predicted_class == 1:
            st.markdown("Sentiment: Neutral :neutral_face:")
            st.image("Neutral_sentiment.jpg")
        else:
            st.markdown("Sentiment: Negative :angry:")
            st.image("Negative_sentiment2.png")

In this code block, we first get the selected model’s name and use it to load the appropriate tokenizer and pre-trained model using the Hugging Face Transformers library.

Next, we tokenize the input text using the tokenizer, which returns a dictionary containing the input IDs and attention mask (BERT's tokenizer also returns token type IDs; RoBERTa's and DistilBERT's do not).
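
To make this concrete, here is a small illustration of the tokenizer output (the example sentence is arbitrary; the keys shown are the standard ones for these tokenizer families):

inputs = tokenizer("Vaccines are great!", return_tensors="pt")
print(list(inputs.keys()))
# BERT tokenizer: ['input_ids', 'token_type_ids', 'attention_mask']
# RoBERTa/DistilBERT tokenizers: ['input_ids', 'attention_mask']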

Then, we make a forward pass through the pre-trained model using the input IDs and attention mask tensors. The outputs contain the predicted logits for each class.

We then use the logits to get the predicted class and associated score for the input text. We also compute the scores for all three sentiment classes: positive, negative, and neutral, and the confidence level of the prediction.

Finally, we display the predicted sentiment class, associated score, confidence level, and an emoji corresponding to the predicted class. We also display an image that reflects the predicted sentiment class.

Displaying the Results in the UI with Altair Charts

After analyzing the user input text using pre-trained models, the next step is to display the results in the UI with Altair Charts. Altair is a declarative visualization library for Python that allows you to easily create interactive visualizations from data.

In the code snippet provided earlier, we imported Altair and then gathered the per-class scores into a dataframe for charting:

import altair as alt

# Collect the three class scores into a dataframe
# (this construction of all_scores_df is assumed; the app computes
# positive_score, negative_score, and neutral_score earlier)
all_scores_df = pd.DataFrame({
    "Sentiment Class": ["Negative", "Neutral", "Positive"],
    "Score": [negative_score, neutral_score, positive_score],
})

# results_df is assumed to be initialized earlier (e.g., empty)
results_df = pd.concat([results_df, all_scores_df], ignore_index=True)

# Create the Altair chart
chart = alt.Chart(results_df).mark_bar(width=50).encode(
    x="Sentiment Class",
    y="Score",
    color="Sentiment Class"
)

This code collects the scores for the three sentiment classes into the results_df dataframe. We then create an Altair bar chart from this dataframe, with the sentiment class on the x-axis, the score on the y-axis, and each bar colored by its sentiment class.

# Display the chart
with col2:
    st.altair_chart(chart, use_container_width=True)
    st.write(results_df)

This will display the chart in our Streamlit app.

Similarly, we can create other types of charts with Altair, such as scatter plots, line charts, and area charts. Altair also allows you to add interactivity to your charts, such as zooming, panning, and hovering. To learn more about creating visualizations with Altair, you can refer to the Altair documentation.
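
For instance, here is a hedged sketch of adding hover tooltips and zoom/pan to the bar chart built above, using Altair's tooltip encoding and interactive() method:

# Add tooltips that show the class and score on hover, and enable
# zooming and panning with interactive()
interactive_chart = alt.Chart(results_df).mark_bar(width=50).encode(
    x="Sentiment Class",
    y="Score",
    color="Sentiment Class",
    tooltip=["Sentiment Class", "Score"]
).interactive()

st.altair_chart(interactive_chart, use_container_width=True)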

Adding Images to the UI

When building a user interface, adding visual elements such as images can greatly enhance the user experience. In this context, images can be used to complement text, display charts, or provide feedback to the user.

In the code snippet below, images are added to the UI based on the predicted sentiment of the input text. If the sentiment is positive, a positive sentiment image is displayed along with the sentiment score, class, and confidence level. If the sentiment is neutral, a neutral sentiment image is displayed. Similarly, if the sentiment is negative, a negative sentiment image is displayed.

To add images to the UI, we can use the st.image() function provided by Streamlit. The function takes the image path as an argument and automatically displays it on the UI. In the example below, the image path is hardcoded for each sentiment class. However, in practice, we may want to make the path dynamic based on user input or other programmatic criteria.

# Print the predicted class and associated score
st.write(f"Predicted class: {predicted_class}, Score: {score:.3f}, Confidence Level: {confidence_level:.2f}")

# Emoji
if predicted_class == 2:
    st.markdown("Sentiment: Positive :smiley:")
    st.image("Positive_sentiment.jpg")
elif predicted_class == 1:
    st.markdown("Sentiment: Neutral :neutral_face:")
    st.image("Neutral_sentiment.jpg")
else:
    st.markdown("Sentiment: Negative :angry:")
    st.image("Negative_sentiment2.png")

Testing the Web App

Home page: select your model of choice.

Type in your text and click the "Analyze" button.

The results will be provided as follows: the predicted class, score, confidence level, sentiment image, and score chart.

Type in a different text and click "Analyze"; the results update accordingly.

Conclusion and Future Directions


In conclusion, we have explored how to build a sentiment analysis app using pre-trained models, Python, Streamlit, and Altair. We learned how to preprocess user input text, use pre-trained models to analyze text sentiment, and display the results using interactive charts and images. This app can be useful for companies looking to monitor customer feedback, sentiment analysis researchers, and anyone interested in analyzing text sentiment.

There are several directions we can take this project to improve its functionality and performance. One of the ways we can improve the app’s performance is by training our own sentiment analysis model using custom data that is specific to our domain. This will improve the model’s accuracy and ensure that it is optimized for our specific needs.

Additionally, we can add more features to the app, such as the ability to analyze sentiment across multiple languages or the ability to analyze sentiment in real-time. We can also explore ways to improve the user interface, such as adding more interactive elements and improving the visualizations.

Overall, sentiment analysis is a valuable tool for businesses and researchers, and building an app using pre-trained models and Python is a great way to get started. With some additional development and optimization, this app has the potential to be a powerful tool for analyzing sentiment across a wide range of applications.

Try out the web app here: https://huggingface.co/spaces/Abubakari/Covid_Vaccines_Tweet_Sentiment_Analysis
