The Data Behind the Perfect Reddit Headline
I analyzed 10M+ Reddit titles and made a tool to optimize submissions
Reddit is probably the biggest social network where one can create huge drama without having a social circle, friends, or followers. You just need to be interesting, provocative, funny, or use some other secret formula.
I wanted to try and understand what this secret formula is all about.
I analyzed data from over 10 million Reddit threads in over 100 subreddits and tried to find interesting correlations. Amazingly, I couldn’t. It’s too chaotic.
The data does start to become interesting when you drop the ambition to analyze the entire network and go down a level, to the subreddits.
I found that different subs are triggered by different topics, word phrasings, semantics, and tone. Very similar threads, with slightly different titles, perform very differently.
What does perform mean? For my research, it means attracting more upvotes and creating a big discussion in the comments section.
Upvotes, comment counts, and karma are the currency of Reddit’s gamified world. They are also the key metrics for measuring how your posts perform, and key tools for growth hackers.
Case in Point: r/Lifehacks
Let’s examine a medium-sized community on Reddit. Lifehacks is a fun community with almost four million members. It has been around since 2008, so quite a lot of data is available for analysis.
An average post from the last couple of years attracts 857 upvotes and 41.9 comments. (For the sake of analysis, every number presented here is skewed a bit to prevent reverse engineering, and I eliminated the top 5% and bottom 5% of threads to keep anomalies from distorting the averages.)
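For readers who want to try the same kind of trimming on their own data, here is a minimal sketch. The function name `trimmed` and the upvote counts are invented for illustration, and the original analysis may have handled the cut-offs differently.

```python
from statistics import mean

def trimmed(values, percent=5):
    """Return the values sorted, with the lowest and highest `percent` removed."""
    ordered = sorted(values)
    k = len(ordered) * percent // 100  # number of items to cut from each end
    return ordered[k:len(ordered) - k]

# Made-up upvote counts: 18 ordinary threads, one dud, and one viral hit.
upvotes = [0] + [100 * i for i in range(1, 19)] + [250_000]
kept = trimmed(upvotes)

print(len(upvotes), len(kept))                   # 20 18
print(round(mean(upvotes)), round(mean(kept)))   # 13355 950
```

A single viral thread drags the plain average far away from what a typical post gets, which is why the trimmed figure is the more useful one here.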
An average title on r/lifehacks contains 8.7 unique words (not counting prepositions, pronouns, determiners, conjunctions, and interjections), so the average contribution of a single word is about 98.5 upvotes and 4.8 comments.
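The arithmetic behind those per-word figures is just the average score divided by the average number of counted words. The sketch below shows that division, plus one possible way to count unique content words. The `STOPWORDS` set is a tiny stand-in and the example title is made up; a real analysis would need a full list of function words.

```python
import re

# Tiny illustrative stopword set (an assumption, not the list used in the article).
STOPWORDS = {"a", "an", "the", "to", "of", "in", "on", "for", "with",
             "and", "or", "but", "you", "your", "it", "this", "that", "oh", "wow"}

def content_words(title):
    """Return the set of unique tokens in a title that are not stopwords."""
    tokens = re.findall(r"[a-z0-9']+", title.lower())
    return {t for t in tokens if t not in STOPWORDS}

def per_word(avg_metric, avg_unique_words):
    """Average contribution of one word to a metric such as upvotes."""
    return avg_metric / avg_unique_words

words = content_words("Use a rubber band to open a stuck jar lid")
print(len(words))                       # 7
print(round(per_word(857, 8.7), 1))     # 98.5 upvotes per word
print(round(per_word(41.9, 8.7), 1))    # 4.8 comments per word
```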
Now, the variance is huge. The standard deviation is over 450 for upvotes and almost 11 for comments (and that’s after eliminating anomalies and extreme cases).
It’s easy to see that some words and topics trigger upvotes and discussion much more than other, very similar words.
As is very apparent from the chart above, the contribution of different words to the score (upvotes) varies significantly, even between similar words. Changing the word “effective” to “useful” in your title can lead to a major increase in upvotes and comments. The same logic applies to “fast” and “instant.”
Also, looking at word data can lead to interesting conclusions as to the topics that make a community tick. While this list is very partial (taken from over 30,000 words analyzed per community), it’s easy to see what interests members of the lifehack community. They want to solve problems, they want to do things in a creative, fast, innovative way.
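One simple way to get a word-level view like this is to average the upvotes of every title that contains a given word. The sketch below does that on four invented threads. The function name `word_scores`, the `min_count` cut-off, and the numbers are all assumptions for illustration, not the method or data behind the chart.

```python
from collections import defaultdict

def word_scores(threads, min_count=2):
    """Map each word to the mean upvotes of the titles that contain it.

    Words seen in fewer than `min_count` titles are dropped as too noisy.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    for title, upvotes in threads:
        for word in set(title.lower().split()):  # count each word once per title
            totals[word] += upvotes
            counts[word] += 1
    return {w: totals[w] / counts[w] for w in totals if counts[w] >= min_count}

# Invented (title, upvotes) pairs, for illustration only.
threads = [
    ("useful kitchen trick", 900),
    ("useful travel trick", 700),
    ("effective kitchen trick", 300),
    ("effective travel trick", 100),
]
scores = word_scores(threads)
print(scores["useful"], scores["effective"])  # 800.0 200.0
```

An average like this only shows correlation. A word can score well simply because it tends to appear in posts about popular topics.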
If you phrase your titles accordingly, your chances of getting more attention for your thread are much higher.
What Else Impacts Your Reddit Performance?
My research found many factors that can either increase or decrease your chances to make it big on Reddit, and all of them are subreddit-dependent.
Some of the more interesting ones include:
- Length of title. How many words have you used? Different subs limit you to different lengths of title, but the impact of making it shorter or longer is actually significant. For example, on r/lifehacks, longer titles actually work better, up to a point. Titles longer than 18 words start to underperform. But title lengths also correlate with something else…
- Presence of an image/video. In some communities, it helps to have an image or video alongside the title. In others, it’s frowned upon. In r/lifehacks, it doesn’t help you. It may even hurt your efforts to make an impact. However, very short titles with an image do a little better than short titles without one.
- Using a question. In many communities, but not all of them, having a question in your title encourages discussion. That seems natural. However, what’s the impact on upvotes? That changes between communities. In r/lifehacks, questions don’t help your cause (unless, of course, your cause is to ask a question). Non-questions get on average 162% more upvotes, and they even draw 27% more comments.
- Using flair. Flair is a feature used in some Reddit communities that lets you tag your post with a specific niche or topic, for easy filtering. It has different use cases around Reddit. For some communities, it’s the basis for creating sub-subreddits. For others, it’s more of a moderator setting and is less important.
In communities where flair plays a role, it also impacts your chances for more upvotes and pulling comments.
Because r/lifehacks doesn’t use flair, let’s talk about r/futurology. As a community that discusses various topics, it makes sense for some topics to attract more attention than others. The Environment flair, for example, is a big hit. It gets 240% more upvotes than the average post and is a little bit better than posts with no flair at all. It also performs well in terms of comments.
- Tone and sentiment. How you phrase your post is super significant. Most of us try not to feed the trolls, but trolls do get fed and enjoy quite a lot of success on Reddit. That’s why I decided to analyze sentiment as well. There are quite a few tools that allow for sentiment analysis, and I used the AFINN-165 wordlist for this case.
I tried to sort the sentiment of each thread title into one of seven categories (very negative being the worst, and very positive being the best). What works best varies a great deal from one community to the next. In r/lifehacks, very positive titles are the best for both upvotes and comments. However, you’ll have plenty of luck with neutral titles as well.
Other communities encourage negativity, and negative titles, filled with negatively scored words from the AFINN-165 wordlist, make much more noise.
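Several of the factors above (title length, question or not, sentiment) can be read straight off the title. Here is a minimal sketch of how that might look. `TOY_LEXICON` is a small made-up list in the style of AFINN, not the real AFINN-165 values. The seven bucket thresholds and the names of the four in-between categories are assumptions too, since only "very negative", "neutral", and "very positive" are named above.

```python
import re

# Made-up word scores in the style of AFINN (not the actual AFINN-165 list).
TOY_LEXICON = {"amazing": 4, "great": 3, "useful": 2, "easy": 1,
               "hard": -1, "annoying": -2, "terrible": -3, "hate": -3}

# (upper bound, label) pairs checked in order; the thresholds are assumptions.
BUCKETS = [(-5, "very negative"), (-3, "negative"), (-1, "slightly negative"),
           (0, "neutral"), (2, "slightly positive"), (4, "positive")]

def sentiment_bucket(score):
    """Place a summed title score into one of seven sentiment categories."""
    for upper, label in BUCKETS:
        if score <= upper:
            return label
    return "very positive"

def title_features(title):
    """Extract word count, question flag, and sentiment from a title."""
    tokens = re.findall(r"[a-z']+", title.lower())
    score = sum(TOY_LEXICON.get(t, 0) for t in tokens)
    return {
        "word_count": len(tokens),
        "is_question": title.strip().endswith("?"),
        "sentiment_score": score,
        "sentiment": sentiment_bucket(score),
    }

print(title_features("An amazing and easy way to peel garlic"))
print(title_features("Why is this so annoying?"))
```

The first title comes out as 8 words, not a question, and very positive. The second is a 5-word question that lands in the slightly negative bucket. Whether a given value helps or hurts still depends on the subreddit.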
Other Factors That I Didn’t Include in My Work
Of course, there are many more factors that can affect your Reddit success. Most notably, it matters what you actually write about and whether it’s good and interesting enough for the community in question. But there are more:
- Luck. This is probably the number one factor. Who exactly is exposed to your title? Did it reach the right people in the first few hours, which matter most for further exposure on Reddit? Did a volcano just erupt and shift people’s attention from Reddit to the news?
- Time of day. Some analysis has been done on the right time of day and day of the week to post on Reddit. I didn’t include it here.
- Social circles. While Reddit is probably one of the very few social networks that make it possible for an anonymous poster to attract a lot of attention, being popular will help. Recognized users and users with some following have a clear advantage in getting upvotes and comments in their communities.
- Circumventing the system/hacking. People try to manipulate Reddit all the time. Reddit fights them, and they tweak a little and fight back. Once bots are involved, you can’t predict the results of your efforts. It changes everything. It pushes good content away and puts the spotlight on bad content.
What Did I Do With the Data?
I tried to combine all my findings into one tool that analyzes a given title in a subreddit and predicts how well it will do. While there is no magic involved, it does quite well at back-predicting how titles performed in the past.
You can have a go at Reddit Write. You can use ten popular subreddits for free, and if you find it useful, there’s also a paid version with more features.
In any case, this software is mainly for fun, but it can be useful for Reddit marketers and growth hackers in general. Have a go at optimizing your Reddit threads: try changing the wording, and you’ll probably do better on Reddit once you do.
Please note: Reddit Write doesn’t check if your title actually makes sense or is interesting. That’s all up to you. It’s meant to be used by humans with good intentions who post interesting, relevant things. If you just want to do damage, please don’t use this tool. It’s not good for hacking, circumventing the Reddit upvote system, or any other unfair use you can think of. I’m by no means affiliated with Reddit.