Deciphering Stances on Twitter: Enhancing Sentiment Analysis with BERT Adapters

Roger G March
6 min read · Dec 31, 2023


In today’s digital age, Twitter has become a bustling hub for real-time news and conversations. However, this rapid flow of information also brings a tide of rumors, making it increasingly challenging to distinguish fact from fiction. As an enthusiast in the field of natural language processing (NLP), I embarked on a fascinating journey to tackle this issue using a powerful tool in the AI world: BERT adapters.

BERTweet, a BERT-style model pre-trained specifically on a massive corpus of English tweets, stood as my model of choice. Its social-media pre-training primed it for the nuances of tweets: hashtags, mentions, and informal spelling. But the real game-changer in my endeavor was the incorporation of a custom adapter module, a lightweight, flexible layer designed to supercharge BERTweet's capabilities in rumor stance detection.

Brief Overview: Data and Labels

In SemEval-2017 Task 8, participants were given conversation threads from Twitter, centered around rumored information. The challenge was to analyze these tweets and classify them into one of four interaction types, reflecting the stance of each tweet in relation to the rumor:

  1. Support: The tweet agrees with the rumor.
  2. Deny: The tweet disagrees with the rumor.
  3. Query: The tweet asks for more information about the rumor.
  4. Comment: The tweet comments on the topic without a clear stance on the rumor’s veracity.

This task focused on understanding the public’s reaction to rumors by categorizing their responses.

Streamlining Twitter Data for BERTweet Analysis

Before diving into the intricacies of sentiment analysis with BERTweet, the raw Twitter chatter needs a bit of tidying up. The goal? To transform these tweets into a format that BERTweet can easily digest. Here’s a snapshot of the cleanup routine:

  1. Simplifying Language: All tweets are toned down to lowercase for consistency. User mentions get a makeover with a universal ‘@USER’ tag, making everyone equally anonymous.
  2. Taming URLs: Those lengthy, distracting URLs? They’re neatly swapped out with ‘HTTPURL’. It’s like saying, “There’s a link here,” without the clutter.
  3. Deep Cleaning: This is where the magic happens. Emojis turn into plain text (so long, smiley faces, hello words!), while any stray HTML tags and random brackets get shown the door.
  4. Finishing Touches: The tweets are then polished by removing digit-laden words but keeping those all-important hashtags intact. What’s left is a stream of neat, orderly text, free from the chaos of extra spaces and irrelevant characters.

This prep work is more than just housekeeping; it’s about setting the stage for BERTweet to perform at its best, ensuring each tweet is a clear, concise piece of the sentiment puzzle.


import re

def text_preprocessing_pipeline_v4(text):
    '''Clean and transform the tweet by adding special tokens.'''
    text = str(text).lower()

    # Replace user mentions with @USER
    text = re.sub(r'@\w+', '@USER', text)

    # Replace URLs with HTTPURL
    text = re.sub(r'https?://\S+|www\.\S+', 'HTTPURL', text)

    # Clean the text (demojize emojis, strip HTML tags, brackets, digit-laden words)
    text = clean_text(text)

    # Correct spelling (optional, depending on your needs)
    # text = correctSpelling(text)

    # Remove unnecessary white spaces
    text = remove_space(text)

    return text

# Example usage
input_tweet = "SC has first two presumptive cases of coronavirus #COVID19, DHEC confirms http://example.com via @johnsmith 😢 #StaySafe"
processed_tweet = text_preprocessing_pipeline_v4(input_tweet)
print(processed_tweet)
# Output: "sc has first two presumptive cases of coronavirus #covid19, dhec confirms HTTPURL via @USER cry #staysafe"

Crafting the Adapter for BERTweet

Adapter vs. Fine-Tuning: The Big Picture

Unlike traditional fine-tuning, where the entire model is tweaked, adapters offer a more focused approach. They're like specialized mini-models inserted into BERTweet, trained only on the task at hand: deciphering stances in tweets. This means less computational load and a more targeted learning process.
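To make the "less computational load" claim concrete, here is a quick sanity check you can run (this assumes the adapter-transformers library, which the code later in this post relies on). The exact numbers will vary, but only a small fraction of the weights should come out trainable:

from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("vinai/bertweet-base", num_labels=4)
model.add_adapter("sst")      # add a fresh adapter (same name used later in this post)
model.train_adapter(["sst"])  # freeze everything except the adapter weights

total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Trainable: {trainable:,} of {total:,} parameters ({100 * trainable / total:.1f}%)")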

Training Approach:

  • Label Mapping: I started by mapping the stance categories (‘comment’, ‘support’, ‘deny’, ‘query’) to numerical labels. This step is crucial for the model to recognize and differentiate between the stances.
label_mapping = {'comment': 0, 'support': 1, 'deny': 2, 'query': 3}

df_train['label'] = df_train['output'].map(label_mapping)
df_val['label'] = df_val['output'].map(label_mapping)
df_test['label'] = df_test['output'].map(label_mapping)
  • Dataset Preparation: The tweets were processed and transformed into a format suitable for BERTweet, involving tokenization and the application of attention masks.
for split in dataset.keys():
    dataset[split] = dataset[split].map(
        encode_batch,
        batched=True,
        remove_columns=["input", "output", "__index_level_0__"],  # Adjust according to your dataset
        load_from_cache_file=False
    )
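The encode_batch function used in the map call above isn't defined in this post. A minimal version, assuming the tweet text sits in the 'input' column and using the BERTweet tokenizer loaded in the next step, could look like this:

def encode_batch(batch):
    '''Hypothetical sketch: tokenize a batch of preprocessed tweets for BERTweet.'''
    return tokenizer(
        batch["input"],      # "input" is the tweet-text column (see remove_columns above)
        max_length=128,      # BERTweet was pre-trained with a 128-token limit
        truncation=True,
        padding="max_length",
    )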
  • Adapter Integration: The adapter was added to BERTweet, configured specifically for our four-category stance detection task.

# Load the BERTweet configuration with custom settings
config = AutoConfig.from_pretrained(
    "vinai/bertweet-base",  # Specifies the pre-trained model, in this case BERTweet base
    num_labels=4,  # Number of labels for classification (four stance categories)
    id2label={0: "comment", 1: "support", 2: "deny", 3: "query"}  # Mapping from label indices to human-readable labels
)

# Load the pre-trained BERTweet model for sequence classification with the above configuration
model = AutoModelForSequenceClassification.from_pretrained(
    "vinai/bertweet-base",  # Specifies the pre-trained model to use
    config=config  # The configuration object defining model parameters
)

# Load the tokenizer for the BERTweet model
tokenizer = AutoTokenizer.from_pretrained("vinai/bertweet-base")


# Adding an adapter to the model
model.add_adapter("sst")  # Adds a new adapter named 'sst'

# Configure the model to update only the adapter parameters during training;
# all pre-trained BERTweet weights stay frozen.
model.train_adapter(["sst"])  # Specifies that only the 'sst' adapter should be trained

# Tell the model to use the specified adapter(s) during inference and/or training.
model.set_active_adapters(["sst"])  # Activates the 'sst' adapter
  • Targeted Training: Using an AdapterTrainer, the training was focused solely on the adapter’s parameters. This method ensures that the core BERTweet model remains unchanged, while the adapter learns from the stance-labeled tweet data.
trainer = AdapterTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    compute_metrics=compute_accuracy,
    callbacks=[loss_logging_callback]
)
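The training_args, compute_accuracy, and loss_logging_callback objects passed to the trainer aren't shown above; the callback is just a convenience for logging losses. The other two might look roughly like this (hyperparameters here are illustrative, not the exact values from the notebook):

import numpy as np
from transformers import TrainingArguments

def compute_accuracy(eval_pred):
    '''Fraction of tweets whose predicted stance matches the gold label.'''
    preds = np.argmax(eval_pred.predictions, axis=1)
    return {"accuracy": (preds == eval_pred.label_ids).mean()}

training_args = TrainingArguments(
    output_dir="./stance_adapter_runs",  # illustrative path
    learning_rate=1e-4,                  # adapters tolerate higher rates than full fine-tuning
    num_train_epochs=100,
    per_device_train_batch_size=32,
    evaluation_strategy="epoch",
    logging_strategy="epoch",
)

Once training finishes, the adapter weights can be saved on their own with model.save_adapter, which is one of the practical perks of this setup: the artifact is a few megabytes instead of a full model checkpoint.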

If you want to check the full code or use it for your own projects, the notebook is on GitHub: https://github.com/RogerGMarch/Stance_Detection_SemEval_2017/blob/main/Transfer_Learning_for_Sentiment_Classification_Twiiter.ipynb

Results: Stance Detection with the BERTweet Adapter

We found that around 100 epochs is the sweet spot for our dataset. Here are the results.

Accuracy results for 80 epochs.

The adapter’s performance with BERTweet for stance detection shows promise, especially considering the constraints of limited data.

It's worth noting that traditional machine learning models like XGBoost tend to overfit to the most common class, in this case 'comment', because examples of 'support', 'deny', and 'query' are scarce.

As the label distribution shows, the data for the 'deny' and 'query' labels is very limited.

However, the adapter approach demonstrates its strength in handling such imbalanced datasets, achieving reasonable precision and recall across different stances, unlike conventional models which might default to the majority class. The lower performance in the ‘deny’ category highlights the challenge of limited data but also underscores the adapter’s relative robustness in more nuanced classification tasks.
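If you want the per-class numbers behind that observation, scikit-learn's classification_report on the test split does the job. A sketch, assuming the trained trainer and tokenized dataset from earlier:

import numpy as np
from sklearn.metrics import classification_report

# Predict on the held-out test split with the trained adapter
predictions = trainer.predict(dataset["test"])
preds = np.argmax(predictions.predictions, axis=1)

print(classification_report(
    predictions.label_ids,
    preds,
    target_names=["comment", "support", "deny", "query"],
    zero_division=0,
))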

Conclusion

The deployment of a BERTweet adapter in our stance detection task was a calculated move to overcome the constraints of limited data — a common challenge in machine learning. Stance detection is vital in deciphering the tone of conversations on social media, where every nuance matters. Despite the data scarcity, particularly for ‘deny’ and ‘query’ categories, the adapter proved to be a robust solution, avoiding the pitfalls of traditional models that often default to the most common class. This endeavor not only showcased the adapter’s capability to extract meaningful insights from sparse datasets but also underscored its potential in enhancing the accuracy of sentiment analysis in the digital communication landscape.
