How to Recommend — Twitter and Kariyer.net

Published in

Kariyer.net Tech

6 min readMay 11, 2023

Recently, Twitter has made their recommendation algorithm public. There is a lot that can be learned from what is clearly the result of a lot of thought, experiment and work. In this post, we will summarize Twitter’s algorithm, the algorithm we use here at Kariyer.net and discuss the components that can elevate our recommendation system.

Images used in this post are all from https://github.com/twitter/the-algorithm unless otherwise stated.

Overview of Twitter’s Algorithm:

Data:

Twitter is a social media platform, so the data that flows through it is quite different from Kariyer.net. This data is collected under three different categories:

Social graph, a graph consisting of nodes (users) and edges (follows)
Tweet engagement, users’ engagements with tweets such as likes and retweets.
User specific data, such as unfollows and mutes.

Features:

Twitter’s graphs are powered by GraphJet, a real-time graph processing library.

SimClusters is a clustering method performed on vectors extracted from a bipartite graph representing “Producers” and “Consumers” to detect communities. Producers are the set of “influencers” within a community, and Consumers are the users that follow these producers.

An adjacency matrix is retrieved from this graph and used to extract vectors for each Producer. These vectors are used to calculate similarity between each Producer via cosine similarity. Metropolis-Hastings is used to detect communities on a weighted graph constructed from these similarities. A total of ~145000 communities are detected.

A new matrix is calculated by multiplying previous matrices, the adjacency matrix of Producers and Consumers, and the matrix constructed by stacking the community affiliation vectors produced by the Metropolis-Hastings algorithm. This new matrix represents each Consumer’s interests in each detected community. This is used to generate recommendations for users. For example, if a user shows a high interest in a community, they will be recommended to follow another Producer highly affiliated with the same community.

“Interested In” matrix constructed from other matrices

TwHIN is a knowledge-graph embedding technique that can produce dense embeddings from user engagement data. Each node can be represented by an embedding vector via TwHIN.

Example knowledge-graph from user engagement data. Source: https://arxiv.org/abs/2202.05387

RealGraph is a gradient-boosting classifier model that is able to predict whether an interaction between two users will occur. This model is trained on user-to-user interaction graph. Interactions can be a user liking the tweet of, or following another user. Each user’s interactions are sampled for a determined time period and these interactions are collected. Afterwards, the time period is incremented by one day. If, in the next day, an interaction occurred again that occurred in the previous set of interactions, that interaction is labeled with 1, if an interaction has not occurred again it is labeled with 0. The classifier model is then trained on this labeled dataset.

TweepCred is an efficient implementation of PageRank algorithm employing Hadoop to determine the influence of each user on the network. This algorithm uses the same user interaction graph as TwHIN.

Trust and Safety models are used to filter out unsuitable content from both the dataset and the Twitter platform itself.

Candidate Sourcing:

Recommendation candidates are generated from three different sources:

Search Index: An efficient inverted-index for search is build on EarlyBird, a real-time search system based on Apache. This search index is updated in real-time and used to generate recommendation candidates for a user from their network.
UTEG (UserTweetEntityGraph): Performing traversals and aggregations on weighted User-Tweet interaction graphs, Out of Network tweet recommendation candidates are generated. This is powered by GraphJet and only 48 hours worth of data is used.
CR-Mixer (Coordinating Layer): A system that manages other steps in candidate sourcing and perform light ranking on them to get most likely candidates before feeding them into the “heavy ranker”.
FRS (Follow Recommendation Service): Using the RealGraph model, possible interactions with other users are predicted. The higher the probability of interaction between users, the more likely those users will be recommended to follow each other.

Ranking and Serving:

Finally, all of the recommendation candidates are ranked by a Neural Network. After applying some filters, the top ranked tweets are shown in a user’s feed. The filters are:

Social Proof: Show tweets from accounts a user has interacted with.
Author Diversity: Don’t recommend tweets from the same author consecutively.
Visibility: Hide tweets from accounts the users has muted or blocked.
Content Balance: Choose from both In and Out of Network candidates.
Feedback Fatigue: Reduce the ranking of tweets a user has interacted with negatively.

Overview of our Algorithm:

Here at Kariyer.net, we are mainly concerned with job advertisements and users interactions with them. User interactions can be viewing an ad or applying to one. Job advertisements are already categorized by their Job Titles and Sectors.

The recommender system we created have two main components: an offline engine that calculates the similarities between each Job Advertisement pairs, and an online service that can rank and serve recommendations for users in real time.

Offline:

We model users’ application data as a knowledge graph powered by Neo4J. This knowledge graph contains 2 types of nodes, Candidates and Jobs. If a Candidate has applied to a Job, then there is an edge present between those two nodes.

A small subset of the constructed knowledge graph

To calculate similarities, we use link prediction techniques between Job nodes. Normally there wouldn’t be any links forming between Job nodes, however by using techniques that predict new links based on common neighbors we can calculate the probability of a new link occurring between two Job nodes. We take this probability between 0 and 1 as the similarity value between nodes.

Online:

Online service responds in real time whenever a user is to be shown personalized recommendations in our website. This service takes a user’s past job applications (or viewed advertisements if no applications are present). Afterwards, it takes the most similar n number of advertisements to each of the advertisement in the user’s history, putting them in separate lists. It then normalizes each of these lists so that their similarity scores sum up to 1. These lists are then combined into a single list and ranked by their normalized similarity scores. Candidates are shown recommendations in the order of these scores.

What can we learn?

Twitter uses knowledge graphs to generate recommendations extensively, as do we. While they generate embeddings for each entity in their graphs and use cosine similarity on those embeddings, we use link prediction as a similarity metric. However, graph embeddings has the ability to contain much more information within. Adding more node types, or perhaps adding features to nodes via feature extraction from job advertisement/resume text might result in a graph containing more meaningful information. This addition of information will result in more accurate similarity scores to be calculated.

The filters that Twitter’s algorithm applies at the end of ranking the recommendation candidates can be applied to our platform as well. For example, the Feedback Fatigue filter, where the users are less likely to be shown items that are relevant to the ones they negatively interacted with is suitable for our use case. Even though most of our interaction data is implicit (viewed an ad, clicked away without applying, etc.) rather than explicit (liked or retweeted a tweet, blocked or muted someone, etc.) we can make use of this filter to serve better recommendations.

How to Recommend — Twitter and Kariyer.net

Written by Anıl Özlü