Jinal Kalpesh Shah
Web Mining [IS688, Spring 2021]
4 min read · Mar 24, 2021


Twitter Follower Recommendation Engine

On Twitter, we follow a bunch of people. These are people we have either met in person and want to keep in touch with, or people we are interested in following, or both!

These days we don’t attend many conferences or events, and because of the pandemic everything has moved online. Finding more people to follow that way is slow going; we probably only meet a handful of new people.

The number of interesting people to follow on the internet, meanwhile, is practically limitless. The problem is finding them in a virtual sea of bots, marketing-only accounts, and complainers.

Twitter does have a “Who to Follow” page, and while I’m sure it has some great suggestions, I don’t necessarily trust all of Twitter’s recommendations. It’s kind of like trusting a site like Yelp for restaurant reviews when the local McDonald’s rates better than your favorite burger place. So I decided to create more personalized recommendations.

So that leads me to this project: I decided to find better personalized recommendations on Twitter by looking at who my friends are following that I am not.

Getting the Data

To start, I needed a list of the people I follow on Twitter, and then a list of who they follow. The proper way to do this is to use the official Twitter API, so I wrote some Python code to do exactly that.

Essentially it uses the friends/list endpoint to download all of my Friends. I then went through each of those Friends and found all of their Friends.
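A minimal sketch of that two-level crawl might look like the following. The `fetch_friends` callable is a hypothetical stand-in for whatever code calls the friends/list endpoint; it is not part of any Twitter library, and a real version would also need to page through results and respect the API's rate limits. Here it is replaced by a small in-memory dictionary so the example runs offline.

```python
from typing import Callable, Dict, Iterable, List


def crawl_friends(
    me: str,
    fetch_friends: Callable[[str], Iterable[str]],
) -> Dict[str, List[str]]:
    """Return {user: [accounts that user follows]} for me and each of my friends.

    fetch_friends is a hypothetical wrapper around the friends/list endpoint.
    """
    # First level: everyone I follow.
    following = {me: list(fetch_friends(me))}
    # Second level: everyone each of those accounts follows.
    for friend in following[me]:
        if friend not in following:
            following[friend] = list(fetch_friends(friend))
    return following


# Toy data standing in for real API responses (made-up usernames).
TOY_NETWORK = {
    "me": ["alice", "bob"],
    "alice": ["bob", "carol"],
    "bob": ["carol", "dave"],
}

following = crawl_friends("me", lambda user: TOY_NETWORK.get(user, []))
print(following)
```

Passing the fetcher in as an argument keeps the crawl logic separate from authentication and HTTP details, which makes it easy to test without touching the network.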

Graph Analysis with NetworkX

Once I had the data downloaded, it was time to find relationships between my friends and the people they follow. For this, I decided to use an open-source Python library called NetworkX, which is built for complex network analysis.

NetworkX uses a graph structure for its analysis. A graph is made up of nodes and edges. In our case, the Twitter users are the nodes, and the edges are the follow relationships. The first thing I did was load all of the people I follow and create a directed edge from me to each of them to indicate the one-way relationship. Next, I looped over all those people’s friends, adding more nodes and edges.
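To make the graph construction concrete, here is a small sketch using `nx.DiGraph`. The `following` dictionary and the usernames are made-up sample data, and `build_follow_graph` is an illustrative helper name rather than anything from the original script.

```python
import networkx as nx


def build_follow_graph(me, following):
    """Build a directed graph where an edge A -> B means "A follows B"."""
    graph = nx.DiGraph()
    my_friends = following.get(me, [])
    # Edges from me to everyone I follow.
    for friend in my_friends:
        graph.add_edge(me, friend)
    # Edges from each of my friends to everyone they follow.
    for friend in my_friends:
        for their_friend in following.get(friend, []):
            graph.add_edge(friend, their_friend)
    return graph


# Made-up sample data: {user: [accounts that user follows]}.
following = {
    "me": ["alice", "bob"],
    "alice": ["bob", "carol"],
    "bob": ["carol", "dave"],
}

graph = build_follow_graph("me", following)
print(graph.number_of_nodes(), graph.number_of_edges())
```

`add_edge` creates any missing nodes automatically, so there is no need to add users to the graph before connecting them.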

At this point, my graph had 125,000 nodes, which was way too many to draw quickly on my computer. I figured not all of this data would be useful, so I needed a way to filter it.

I started by filtering out any of my friends’ friends who happen to already be my friends — this wouldn’t be useful since I already follow these people.

Next, I decided to keep only the 50 most-followed Twitter users from what remained. This limits the processing needed while still giving me a list of the most-followed users that my friends follow and I don’t.
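The two filtering steps could be sketched as below. This version makes an assumption the post does not spell out: "most followed" is taken to mean the number of my friends who follow an account (its in-degree in the graph), not the account's overall follower count on Twitter. The helper name `top_candidates` and the sample usernames are illustrative.

```python
import networkx as nx


def top_candidates(graph, me, n=50):
    """Rank accounts my friends follow that I don't, by in-degree in the graph."""
    # Step 1: leave out myself and anyone I already follow.
    excluded = set(graph.successors(me)) | {me}
    scores = {
        node: graph.in_degree(node)
        for node in graph.nodes
        if node not in excluded
    }
    # Step 2: keep the n highest-scoring accounts (ties broken alphabetically).
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


# Made-up sample graph: an edge A -> B means "A follows B".
graph = nx.DiGraph()
graph.add_edges_from([
    ("me", "alice"), ("me", "bob"),
    ("alice", "bob"), ("alice", "carol"),
    ("bob", "carol"), ("bob", "dave"),
])

recommendations = top_candidates(graph, "me", n=50)
print(recommendations)
```

Ranking by overall follower count instead would only require swapping the `scores` dictionary for one built from the follower counts returned by the API.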

Now we are able to use NetworkX’s drawing capabilities to beautifully render the network of users and relationships.

Visual Representation of Network of users and their relationships
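For reference, a drawing like this can be produced with a few lines of NetworkX and Matplotlib. This is a sketch on made-up sample data; the layout, sizes, and output filename are assumptions, not the settings used for the figure above.

```python
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # render off-screen so no display is needed
import matplotlib.pyplot as plt
import networkx as nx

# Made-up sample graph: an edge A -> B means "A follows B".
graph = nx.DiGraph()
graph.add_edges_from([
    ("me", "alice"), ("me", "bob"),
    ("alice", "carol"), ("bob", "carol"), ("bob", "dave"),
])

# A force-directed layout; the seed makes the positions repeatable.
positions = nx.spring_layout(graph, seed=42)

fig, ax = plt.subplots(figsize=(8, 8))
nx.draw_networkx(
    graph,
    pos=positions,
    ax=ax,
    node_size=300,
    font_size=8,
    arrows=True,
)
ax.set_axis_off()

output_path = Path("follow_network.png")  # assumed filename
fig.savefig(output_path, dpi=150)
plt.close(fig)
```

With only the filtered set of users in the graph, a spring layout like this one tends to stay readable; on the full 125,000-node graph it would be far too slow and cluttered.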
