Twitter Algorithm is now public. Here’s what I learned.

I went through > 400,000 lines of code so you won’t have to.

Sarvesh P.
ILLUMINATION
5 min read · Apr 2, 2023

I’d hate to waste your time, so here’s a very short version of this entire article in a Twitter thread. To understand it in depth, keep reading.

Introduction

Twitter is one of the most popular social media platforms in the world, and it is constantly evolving to provide its users with the best possible experience. One of the key ways in which Twitter achieves this is through its recommendation algorithm, which is designed to select the most relevant and engaging tweets for a user’s timeline.

The ranking model at the heart of the algorithm has a mind-boggling 48 million parameters, and the system around it has evolved over nearly two decades of marvelous engineering to serve 150 billion tweets to devices.

Recommending tweets

The system is made up of three main stages: candidate sourcing, ranking, and filtering. In the candidate sourcing stage, Twitter uses several different candidate sources to retrieve recent and relevant tweets for a user. These sources include people you follow (in-network) and people you don’t follow (out-of-network).

Underlying all three stages is a set of core models that aim to answer important questions about the Twitter network, such as, “What is the probability you will interact with another user in the future?”

[Image: helper diagram from Twitter’s blog]

Let’s understand this step by step.

Candidate Sourcing

Twitter runs this pipeline around 5 billion times a day, and each run completes in under 1.5 seconds on average. Yes, that’s right: 5 followed by 9 zeros. For each request, it extracts the best 1,500 Tweets from a pool of hundreds of millions through these sources.
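To put those numbers in perspective, here is a quick back-of-the-envelope calculation. It uses only the figures quoted above; the variable names are mine, and the results are daily averages, not peak load.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

requests_per_day = 5_000_000_000  # "5 billion times a day", as quoted above
candidates_per_request = 1_500    # "best 1500 Tweets" per request, as quoted above

# Average request rate if the load were spread evenly over the day.
requests_per_second = requests_per_day / SECONDS_PER_DAY

# Total candidate tweets selected across all requests in a day.
candidates_per_day = requests_per_day * candidates_per_request

print(f"{requests_per_second:,.0f} requests per second on average")
print(f"{candidates_per_day:,} candidate tweets selected per day")
```

That works out to roughly 58 thousand requests every second, and 7.5 trillion candidate selections per day.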

Twitter finds candidates from people you follow (In-Network) and from people you don’t follow (Out-of-Network). Today, the For You timeline consists of 50% In-Network Tweets and 50% Out-of-Network Tweets on average, though this may vary from user to user.

In-Network sourcing looks at tweets from people you follow and surfaces the most relevant ones using a model that ranks them by relevance. A key input is Real Graph, a model that predicts how likely you are to engage with another user. The higher your Real Graph score with a tweet’s author, the more of their tweets are likely to be included.

Out-of-Network sourcing is how Twitter finds relevant tweets from people you don’t follow. It uses two approaches. The first is to analyze the engagement of the people you follow and look for tweets similar to those they engage with. The second is a method called Embedding Spaces, which generates numerical representations of users’ interests and tweet content to find similarities between users, tweets, and user-tweet pairs.
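To make the embedding idea concrete, here is a minimal sketch of comparing a user’s interest vector with tweet vectors using cosine similarity. This is an illustration only: the three-dimensional vectors, the topic labels, and the choice of cosine similarity are my assumptions, not details taken from Twitter’s code.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


# Hypothetical embeddings; dimensions stand for (sports, politics, cooking).
user_interests = [0.9, 0.1, 0.0]
tweets = {
    "match_recap": [0.8, 0.2, 0.0],
    "pasta_recipe": [0.0, 0.1, 0.9],
}

# Rank tweets by how closely they point in the same direction as the user.
ranked = sorted(
    tweets,
    key=lambda name: cosine_similarity(user_interests, tweets[name]),
    reverse=True,
)
print(ranked)
```

The sports-heavy tweet lands first because its vector points in nearly the same direction as the user’s interests, while the recipe tweet barely overlaps.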

Twitter also has a tool called SimClusters, which discovers communities anchored by clusters of influential users using a custom matrix factorization algorithm. Tweets are embedded into these communities based on their current popularity within each community.

Ranking

Twitter wants to show you Tweets that are interesting and relevant to you. They start by gathering around 1500 Tweets that they think might be good candidates. Then, they score each Tweet to predict how relevant it might be to you. This score is the main way that Tweets are ranked on your timeline.

To rank the Tweets, Twitter uses a large machine learning model called a neural network. The model has around 48 million parameters that work together to predict how people will interact with Tweets. It looks at lots of different factors to make these predictions, like whether a Tweet has been liked or shared by other people. Based on these predictions, the neural network gives each Tweet a score.

Finally, Twitter takes all of the scores and sorts the Tweets from highest to lowest. The Tweets with the highest scores get shown to you first, so you can see the ones that are most likely to be interesting or relevant.
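The score-then-sort step can be sketched in a few lines. In this toy version the predicted probabilities are hard-coded where the real system would get them from the neural network, and the weights are placeholders I made up for illustration. None of these values come from Twitter.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    tweet_id: str
    p_like: float     # predicted probability of a like (stand-in for model output)
    p_retweet: float  # predicted probability of a retweet
    p_reply: float    # predicted probability of a reply


# Placeholder weights, chosen only to make the example readable.
WEIGHTS = {"p_like": 1.0, "p_retweet": 2.0, "p_reply": 3.0}


def score(candidate):
    """Combine the predicted engagement probabilities into a single score."""
    return sum(weight * getattr(candidate, name) for name, weight in WEIGHTS.items())


def rank(candidates):
    """Sort candidates from highest to lowest score."""
    return sorted(candidates, key=score, reverse=True)


candidates = [
    Candidate("a", p_like=0.10, p_retweet=0.02, p_reply=0.01),
    Candidate("b", p_like=0.30, p_retweet=0.05, p_reply=0.00),
    Candidate("c", p_like=0.05, p_retweet=0.01, p_reply=0.20),
]

ordered = [c.tweet_id for c in rank(candidates)]
print(ordered)
```

With these made-up numbers, tweet "c" comes first even though it has the lowest like probability, because the placeholder weights favor replies. Changing the weights changes the order, which is why the weighting matters so much in a system like this.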

Filtering

After ranking the relevant tweets based on their scores, Twitter applies a set of filters and heuristics to ensure that your feed is balanced and diverse. These filters are used to implement various product features that provide a better user experience. Here are some examples:

  • Visibility Filtering: Twitter filters out tweets based on their content and your preferences, for example by removing tweets from accounts you have blocked or muted.
  • Author Diversity: The platform ensures that your feed doesn’t have too many consecutive tweets from a single author.
  • Content Balance: Twitter makes sure that your feed has a fair balance of in-network and out-of-network tweets.
  • Feedback-based Fatigue: If you have provided negative feedback around a certain tweet, Twitter will lower its score to avoid showing it to you again.
  • Social Proof: Twitter only includes out-of-network tweets that have a second-degree connection to you, ensuring that someone you follow engaged with the tweet or follows the tweet’s author.
  • Conversations: Replies to a tweet are threaded together with the original tweet to provide more context to the conversation.
  • Edited Tweets: Twitter determines if the tweets currently on your device are outdated and sends instructions to replace them with the edited versions.
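Three of the filters above (Visibility Filtering, Social Proof, and Author Diversity) are simple enough to sketch. Treat this as a rough illustration under my own assumptions: the dictionary layout, the function names, the limit of two consecutive tweets per author, and the order in which the filters run are all hypothetical.

```python
def make_tweet(tweet_id, author, engaged_by=(), author_followers=()):
    """Build a toy tweet record (hypothetical layout, for illustration)."""
    return {"id": tweet_id, "author": author,
            "engaged_by": set(engaged_by), "author_followers": set(author_followers)}


def visibility_filter(tweets, hidden_authors):
    """Drop tweets whose author the viewer has blocked or muted."""
    return [t for t in tweets if t["author"] not in hidden_authors]


def social_proof_filter(tweets, following):
    """Keep in-network tweets, plus out-of-network tweets that someone the
    viewer follows has engaged with or whose author they follow."""
    kept = []
    for t in tweets:
        in_network = t["author"] in following
        has_proof = bool(following & t["engaged_by"] or following & t["author_followers"])
        if in_network or has_proof:
            kept.append(t)
    return kept


def limit_consecutive_authors(tweets, max_run=2):
    """Drop tweets that would extend a same-author streak beyond max_run."""
    kept, run = [], 0
    for t in tweets:
        run = run + 1 if kept and kept[-1]["author"] == t["author"] else 1
        if run <= max_run:
            kept.append(t)
        else:
            run = max_run  # tweet dropped, so the streak stays at the cap
    return kept


following = {"alice", "bob"}
hidden = {"spammer"}
ranked = [
    make_tweet(1, "alice"),
    make_tweet(2, "alice"),
    make_tweet(3, "alice"),                       # third in a row from alice
    make_tweet(4, "carol", engaged_by={"bob"}),   # out-of-network, bob engaged
    make_tweet(5, "dave", engaged_by={"erin"}),   # out-of-network, no connection
    make_tweet(6, "spammer", engaged_by={"bob"}), # muted author
]

timeline = limit_consecutive_authors(
    social_proof_filter(visibility_filter(ranked, hidden), following)
)
timeline_ids = [t["id"] for t in timeline]
print(timeline_ids)
```

Here tweet 6 is removed because its author is muted, tweet 5 because nobody the viewer follows is connected to it, and tweet 3 because it would be a third consecutive tweet from the same author.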

What will help and what won’t

Here are some things to keep in mind while tweeting that may boost or deboost your tweets.

Boosts:

  • Replies to your tweets count at 1x, which reads best as the baseline against which the other boosts in this list are measured.
  • Including images or videos in tweets can boost recommendations by 2x.
  • Twitter Blue, a paid subscription service, can boost recommendations by 2–4x.
  • Being part of a trusted circle, or having a group of users who frequently engage with your tweets, can boost recommendations by 3x.
  • Retweets from other users can boost recommendations by 20x.
  • Likes on tweets can boost recommendations by 30x.
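One way to read the engagement multipliers above is as relative weights: a like counts 30 times as much as a reply, and a retweet 20 times as much. The sketch below works through that reading. It assumes the weights simply add up across engagement counts, which is my simplification and not something confirmed by the code.

```python
# Multipliers as quoted in the list above; treating them as additive
# per-engagement weights is an assumption made for this illustration.
ENGAGEMENT_WEIGHTS = {"replies": 1, "retweets": 20, "likes": 30}


def engagement_score(counts):
    """Weighted sum of engagement counts under the assumed additive reading."""
    return sum(weight * counts.get(kind, 0) for kind, weight in ENGAGEMENT_WEIGHTS.items())


tweet_a = {"replies": 50}               # fifty replies, nothing else
tweet_b = {"retweets": 1, "likes": 1}   # a single retweet and a single like

print(engagement_score(tweet_a), engagement_score(tweet_b))
```

Under this reading, one like plus one retweet is worth as much as fifty replies, which shows how lopsided the multipliers are if they are taken at face value.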

Deboosts:

  • Tweets that only contain a URL with no accompanying text may be deboosted in recommendations.
  • Tweets with no text or very little text may also be deboosted in recommendations.

Conclusion

In conclusion, the Twitter algorithm is a complex system consisting of candidate sourcing, ranking, and filtering stages. The algorithm is designed to provide users with the most relevant and engaging tweets for their timelines.

Twitter uses various models and techniques, such as Real Graph, Embedding Spaces, and SimClusters, to find and rank relevant tweets. The neural network plays a critical role in ranking tweets based on their relevance and potential interactions.

Finally, Twitter applies a set of filters and heuristics to ensure that users have a balanced and diverse feed. Understanding how the Twitter algorithm works can help users make the most of the platform and engage with the content that matters to them.
