„Filterbub“ — Analyze your political filter bubble!

TechLabs · Published in Inside.TechLabs · 6 min read · Sep 2, 2020

This project was carried out as part of the TechLabs “Digital Shaper Program” in Münster (Term 2020/01).

Abstract: Political debates these days often take place in emotional and unobjective atmospheres. One reason for this is that people tend to create their own “filter bubbles”, especially on social media. Social networks such as Twitter allow users to follow only public figures who share their opinions and values. However, in order to form clear positions and values, it is important to be able to assess and understand one’s own filter bubble and its political imprint. Our project “Filterbub” directly addresses this issue. The aim of the application is to examine people’s filter bubbles on Twitter and to identify their political imprint. To do this, tweets of members of the German Bundestag were crawled and analyzed in order to create, among other things, word clouds.

Introduction

In times of Brexit, Donald Trump as president of the U.S., and Covid-19, political debates become emotional, heated and unobjective. Many people hold only one opinion and are not willing to accept dissenting views. One reason for this phenomenon is that people move, often unconsciously, in environments where they only receive agreement and interact with people who share similar opinions and values. These environments are also referred to as “filter bubbles”. They occur especially in social media networks, where people can create their own filter bubble by choosing whom to follow or befriend.

Our project “Filterbub” addresses this issue and aims to help people understand the filter bubble they are moving and operating in. We concentrated on the social network Twitter. Our goal was to create an app that makes it possible to analyze the filter bubble of any (at first German) Twitter user. The idea was that one could search for any Twitter user and then get visuals and statistics describing the political imprint of that user’s filter bubble. In particular, we wanted to display word clouds showing the words most frequently tweeted by members of the German parliament, since we found a way to crawl the tweets of these politicians.

Furthermore, the analysis should not require the Twitter user to follow members of the German parliament directly. Instead, we want to identify the most frequent and most significant words tweeted by the politicians. For a politician of the green party “Die Grünen”, for example, this could be “Nachhaltigkeit” (which means sustainability in German). If the word “sustainability” then appears particularly frequently in the timeline of a Twitter user, it can be assumed that this user has a “green political imprint”. In this way, we want to help Twitter users find out the political imprint of their filter bubble, so they can evaluate the environment they are operating in.
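
The keyword idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual code: the function name `imprint_scores`, the keyword lists, and the party label "example_party" are hypothetical. Only “Nachhaltigkeit” as a green keyword comes from the text. The score is simply the share of timeline words that match a party's keyword list.

```python
from collections import Counter
import re

# Hypothetical keyword lists (assumption). In the project, these would be
# derived from the most significant words in each party's tweets.
PARTY_KEYWORDS = {
    "green": {"nachhaltigkeit", "klimaschutz"},
    "example_party": {"steuern", "wirtschaft"},
}

def imprint_scores(timeline, party_keywords=PARTY_KEYWORDS):
    """Return, per party, the share of timeline words matching its keywords."""
    words = []
    for tweet in timeline:
        # Lowercase and keep only runs of (German) letters as words.
        words.extend(re.findall(r"[a-zäöüß]+", tweet.lower()))
    counts = Counter(words)
    total = sum(counts.values())
    if total == 0:
        return {party: 0.0 for party in party_keywords}
    return {
        party: sum(counts[w] for w in keywords) / total
        for party, keywords in party_keywords.items()
    }

# Made-up example timeline (not real tweets).
timeline = [
    "Nachhaltigkeit ist wichtig",
    "Mehr Klimaschutz und Nachhaltigkeit",
]
scores = imprint_scores(timeline)
strongest = max(scores, key=scores.get)
```

A plain word share like this is deliberately simple. A real analysis would probably need to weight words by how distinctive they are for one party compared with the others.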

Methods

Because of our team structure (4 participants of the Data Science Track and 1 participant of the Web Development Track), we worked on Data Science and Web Development in parallel and met approximately every 2–3 weeks via Zoom.

a) Data Science

In the field of data science, we made use of a crawler that allowed us to crawl the tweets of any Twitter user. First, we manually created a list of 507 Twitter accounts of members of the Bundestag, with information on the Twitter account, name, party, total number of tweets, and the date of the last tweet. We saved it as “Twitter Account.xlsx” in the data folder. The tweet data also had to be cleaned to make it usable. The cleaning involved several steps: removing “@something” mentions (when a politician addressed someone in a tweet), removing hyperlinks, removing characters that are neither letters nor numbers, and removing numbers. We also lowercased all words and removed English and German stopwords. Finally, we had to lemmatize the German words. For this, we used a tagger from the Hochschule Hannover (https://github.com/wartaal/HanTa). However, it could not successfully lemmatize all tweets, which is probably due to the mix of German and English words.
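
The cleaning steps listed above could look roughly like the following sketch. It is not the code from our repository: the function name `clean_tweet`, the tiny stopword set, and the example tweet are assumptions made for illustration. Lemmatization with HanTa is left out so that the example runs with the standard library only.

```python
import re

# Tiny illustrative stopword set (assumption). A real run would use full
# German and English stopword lists.
STOPWORDS = {"der", "die", "das", "und", "ist", "für", "the", "and", "is", "for"}

def clean_tweet(text, stopwords=STOPWORDS):
    """Apply the cleaning steps to one tweet and return a list of tokens."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # remove hyperlinks
    text = re.sub(r"@\w+", " ", text)                   # remove @mentions
    text = text.lower()                                 # lowercase everything
    # Keep only letters (including German umlauts and ß) and whitespace.
    # This removes punctuation, the '#' sign, and numbers in a single step.
    text = re.sub(r"[^a-zäöüß\s]", " ", text)
    return [word for word in text.split() if word not in stopwords]

# Made-up example tweet (not real data).
example = "@Beispiel Die Nachhaltigkeit ist wichtig für 2020! https://example.org #Klima"
tokens = clean_tweet(example)
print(tokens)
```

Hyperlinks are removed before the other steps on purpose. Otherwise the character filter would break a URL into fragments that look like ordinary words.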

To get descriptive statistics, we also split off the hashtags and plotted the 10 most frequently used hashtags. We also plotted the top 10 most tweeted accounts.
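
Counting for such top-10 statistics can be sketched as follows. The helper `top_items`, the two patterns, and the example tweets are hypothetical. The sketch also reads “most tweeted accounts” as the accounts mentioned most often, which is an assumption. Plotting is omitted; the returned list of pairs could be passed to any bar chart function.

```python
from collections import Counter
import re

HASHTAG = r"#(\w+)"   # captures the word after '#'
MENTION = r"@(\w+)"   # captures the account name after '@'

def top_items(tweets, pattern, n=10):
    """Count case-insensitive matches of `pattern` and return the n most common."""
    counts = Counter()
    for tweet in tweets:
        counts.update(match.lower() for match in re.findall(pattern, tweet))
    return counts.most_common(n)

# Made-up example tweets (not real data).
tweets = [
    "#Klima und #Bundestag heute @beispiel_a",
    "Debatte im #Bundestag mit @beispiel_a und @beispiel_b",
]
top_hashtags = top_items(tweets, HASHTAG)
top_mentions = top_items(tweets, MENTION)
```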

Another main focus was text mining of the collected tweets.

b) Web Development

In terms of web development, our idea was to create a single-page website. After collecting ideas and making some first drafts, we started to build the webpage. One of several websites that inspired us was “webkid.io”.

We planned the website to consist of 4 sections.

The first section, called “Analyse”, represents the main product and is placed at the top of the application (see screenshot). Our plan is that eventually one can search for a Twitter username, click “Suchen” and get several results that display the political imprint of the timeline.

The other three sections are “Projekt”, “Team” and “Kontakt”. The “Projekt” section displays a description of the project, so new users can get an idea of what we aim to do with our app. The “Team” section shows images of us, the 5 team members, and displays our names. Finally, the “Kontakt” section consists of a contact form through which users, the media or anyone else interested in the project can contact us.

The app is built with React, a JavaScript library for building user interfaces, which helps keep the website fast and responsive. We implemented a component called “React-Scroll”, which has the effect that, when the user clicks on one of the 4 options (Analyse, Projekt, Team, Kontakt) in the navigation bar, the page scrolls smoothly to that section.

Results

As an outcome of our data science work, we were able to download, format and clean around 30,000 tweets. This makes them suitable for further analysis using ML methods.

As a result of our web development work, a solid foundation for the application is in place, so it could be used in the future, although some content and functionality still have to be changed, improved or added.

All in all, we succeeded in applying the knowledge we gained throughout the track and are prepared to integrate the Python script with the website.

Nevertheless, the crucial point of the project is the integration of the Python script and our data into the website to make the tool usable. Due to several issues regarding time management and technical problems, we did not manage to finish our project by the end of the project phase. For example, we had some problems downloading all of the tweets, which we could not solve within the time frame of the project. Twitter also imposes technical restrictions on downloading a large number of tweets over a longer time period.

However, some team members as well as Felix, our data science mentor, have already expressed their interest in continuing the project after the end of the project phase, since we like the idea and think that it could be a very interesting tool in modern times of social media and emotional political debates.

The next steps will be to solve the remaining issues with downloading the tweets, to add some functions and content to the website, and to link the Python script to the website, for which we want to use the Python web framework Flask.

We hope that we can then make the application usable and perhaps even publish it to make a small contribution to improving today’s culture of debate.

GitHub-Link: https://github.com/techlabsms/ms-st-20-12-filterblase

The Team:
Greta Lenzing: Data Science (with Python)
Lingling Tong: Data Science (with Python)
Kevin Refenius: Data Science (with Python)
Simon Haastert: Data Science (with Python)
Nils Köhl: Web Development

Mentors:
Felix Kleine Bösing
Lukas Hoppe
