CVE-2023–29218:Twitter Recommendation Algorithm Vulnerability
The Twitter Recommendation Algorithm through ec83d01 allows attackers to cause a denial of service (reduction of reputation score).
--
Summary
The Twitter Recommendation Algorithm through ec83d01 allows attackers to cause a denial of service (reduction of reputation score) by arranging for multiple Twitter accounts to coordinate negative signals regarding a target account, such as unfollowing, muting, blocking, and reporting, as exploited in the wild in March and April 2023.
How Does Twitter Algorithm Work?
Twitter wants to provide you with the greatest current events the globe has to offer. The nearly 500 million Tweets that are posted every day must be reduced to a small number of the best Tweets in order for them to appear on your device’s For You timeline.
We will go into depth about the many interrelated services and tasks that make up the recommendation system in this post. Although there are several places in the app where Tweets are suggested (Search, Explore, Ads), this essay will concentrate on the For You feed on the main timeline.
Vulnerability Details
The Twitter Recommendation Algorithm through ec83d01 allows attackers to control the recommendation system by coordinating negative signals towards a particular account.
The recommendation algorithm assesses an account’s trustworthiness and relevancy using a variety of signals. These signs include the number of followers, engagement rate, and frequency of tweets.
The vulnerability occurs when many Twitter accounts coordinate negative messages toward a target account. These negative messages might include unfollowing, muting, banning, and reporting the target account.
When the recommendation system recognizes these negative signals, it lowers the target account’s reputation score, making it less likely to appear in recommendations and search results.
Attackers can take advantage of this flaw by setting up many Twitter accounts and directing bad signals toward a single target account. This can be accomplished by using bots or by hiring a group of people to carry out unpleasant behaviors.
The coordinated negative signals might result in a considerable decrease in the target account’s reputation score, resulting in a denial of service.
Steps To Reproduce
- Organize a group with a few friends (I have groups with 40+)
- Find a target, and execute the following tasks in order
- They should follow in preparation, and a few days later unfollow first, [just doing this in 90 days intervals also hurts]
- Then they will report a few “borderline” posts.
- Then they will mute.
- Then they will block.
Relevant code:
val blocks: SCollection[InteractionGraphRawInput] =
GraphUtil.getFlockFeatures(
readSnapshot(FlockBlocksEdgesScalaDataset, sc),
FeatureName.NumBlocks,
endTs)
val mutes: SCollection[InteractionGraphRawInput] =
GraphUtil.getFlockFeatures(
readSnapshot(FlockMutesEdgesScalaDataset, sc),
FeatureName.NumMutes,
endTs)
val abuseReports: SCollection[InteractionGraphRawInput] =
GraphUtil.getFlockFeatures(
readSnapshot(FlockReportAsAbuseEdgesScalaDataset, sc),
FeatureName.NumReportAsAbuses,
endTs)
val spamReports: SCollection[InteractionGraphRawInput] =
GraphUtil.getFlockFeatures(
readSnapshot(FlockReportAsSpamEdgesScalaDataset, sc),
FeatureName.NumReportAsSpams,
endTs)
// we only keep unfollows in the past 90 days due to the huge size of this dataset,
// and to prevent permanent "shadow-banning" in the event of accidental unfollows.
// we treat unfollows as less critical than above 4 negative signals, since it deals more with
// interest than health typically, which might change over time.
val unfollows: SCollection[InteractionGraphRawInput] =
GraphUtil
.getSocialGraphFeatures(
readSnapshot(SocialgraphUnfollowsScalaDataset, sc),
FeatureName.NumUnfollows,
endTs)
.filter(_.age < 90)
// group all features by (src, dest)
val allEdgeFeatures: SCollection[Edge] =
getEdgeFeature(SCollection.unionAll(Seq(blocks, mutes, abuseReports, spamReports, unfollows)))
val negativeFeatures: SCollection[KeyVal[Long, UserSession]] =
allEdgeFeatures
.keyBy(_.sourceId)
.topByKey(maxDestinationIds)(Ordering.by(_.features.size))
.map {
case (srcId, pqEdges) =>
val topKNeg =
pqEdges.toSeq.flatMap(toRealGraphEdgeFeatures(hasNegativeFeatures))
KeyVal(
srcId,
UserSession(
userId = Some(srcId),
realGraphFeaturesTest =
Some(RealGraphFeaturesTest.V1(RealGraphFeaturesV1(topKNeg)))))
}
// save to GCS (via DAL)
negativeFeatures.saveAsCustomOutput(
"Write Negative Edge Label",
DAL.writeVersionedKeyVal(
dataset = RealGraphNegativeFeaturesScalaDataset,
pathLayout = PathLayout.VersionedPath(opts.getOutputPath),
instant = Instant.ofEpochMilli(opts.interval.getEndMillis),
writeOption = WriteOptions(numOfShards = Some(3000))
)
)
// save to BQ
val ingestionDate = opts.getDate().value.getStart.toDate
val bqDataset = opts.getBqDataset
val bqFieldsTransform = RootTransform
.Builder()
.withPrependedFields("dateHour" -> TypedProjection.fromConstant(ingestionDate))
val timePartitioning = new TimePartitioning()
.setType("DAY").setField("dateHour").setExpirationMs(21.days.inMilliseconds)
val bqWriter = BigQueryIO
.write[Edge]
.to(s"${bqDataset}.interaction_graph_agg_negative_edge_snapshot")
.withExtendedErrorInfo()
.withTimePartitioning(timePartitioning)
.withLoadJobProjectId("twttr-recos-ml-prod")
.withThriftSupport(bqFieldsTransform.build(), AvroConverter.Legacy)
.withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
.withWriteDisposition(
BigQueryIO.Write.WriteDisposition.WRITE_TRUNCATE
) // we only want the latest snapshot
allEdgeFeatures
.saveAsCustomOutput(
s"Save Recommendations to BQ interaction_graph_agg_negative_edge_snapshot",
bqWriter
)
I’ve realized that it can be difficult to accurately predict negative signals, especially those that are global in nature. This happens because algorithms are quick to figure out how to manipulate negative feedback, which causes a speedy convergence toward local minima. When you tally up all the bad comments, and your reputation begins to fall toward zero, this pattern becomes clear.
It has come to my attention that an adversary may quickly manipulate negative feedback to force the system into this condition.
If you don’t post, you don’t care about the deboosting. Because the persistent danger can do so, it will also benefit from economies of scale, therefore the remedy must either render it economically unfeasible or restrict its potential effects.
With 10,000 people, limiting the number of blocks has little effect; even with 100 blocks per year, you can produce one million blocks. You must at least change the weight by a similar margin (multiply by 0.000001) if a single actor (botnet) can perform the same number of tasks for less money.
Reference: