What I discovered after analyzing 10,000 Medium posts with Python

9 tips on how to increase reader engagement and earn more claps and views.

Joachim Kuleafenu
Published in Analytics Vidhya
11 min read · Sep 8, 2021

Photo by Road Trip with Raj on Unsplash

NB: A link to the project source code is provided here as well as in the conclusion section.

“more claps == more views”

Yes, that is it. But why?

Did you know that Medium's algorithm, aside from the extra traffic you get when a big publication publishes your article, ranks your article by the number of individual users who clap for it?

Simply put, the more unique claps you get, the better you rank, which apparently affects the amount of money you earn.

Also, an analysis of a million posts published by Harrison Jansma indicated that 61.3% of Medium articles earn fewer than 10 claps, which is undoubtedly discouraging.

So the question is: how do we increase reader engagement and earn more views and claps?

Let’s dig in!

Project goal: analyse 10 thousand Medium posts to derive insights on how to tweak your articles to earn more claps from your readers.

Things to learn

  1. About the dataset in use.
  2. Summary statistical report on data.
  3. Answering the 9 most pressing questions.
  4. Conclusion.

About the dataset in use.

Scraping the dataset from the Towards Data Science archive

To get data for this project, I built a web scraper with Python (you can read how to do so here) to scrape the TDS archive.

In all, 13K data points were collected; after the necessary preprocessing steps, 12K remained.

Summary statistical report on data

I found some interesting results in the statistical summary, and I am sure they can also serve as evaluation metrics.

  • The highest number of claps on a single article in my sample is 19,700.
  • On average, Medium writers earn about 196 claps per article.
  • For your post to be among the top 1% in terms of claps, you must earn at least 1,600 claps.
  • To be part of the top 20%, you must earn at least 225 claps on your post.
  • Half of all articles earn 80 claps or fewer, which is 116 fewer than the average number of claps.
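The figures above are a maximum, a mean, a median and two percentile cut-offs. A minimal sketch of how such a summary could be computed with only the Python standard library follows; the function name `clap_summary` and the toy numbers are assumptions for illustration, not the scraped data or the project's actual code.

```python
import statistics

def clap_summary(claps):
    """Return max, mean, median and the top-20% / top-1% clap thresholds."""
    ordered = sorted(claps)
    # With n=100, statistics.quantiles returns the 99 percentile cut points.
    percentiles = statistics.quantiles(ordered, n=100, method="inclusive")
    return {
        "max": ordered[-1],
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "top_20_threshold": percentiles[79],  # 80th percentile
        "top_1_threshold": percentiles[98],   # 99th percentile
    }

# Hypothetical toy sample, not the real dataset.
sample_claps = [0, 5, 12, 40, 80, 95, 150, 225, 600, 1600]
summary = clap_summary(sample_claps)
print(summary)
```

A mean far above the median, as in the article's 196 versus 80, is the usual sign of a few very popular posts pulling the average up.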

I also wanted to know how consistently authors reach that top 20% threshold. This is what I found:

  • 28.12% of authors have at least one article reaching 225 claps, the top 20% threshold.
  • Fewer than half of those authors, representing 10.6%, have a second article that reaches it as well.

So the rate at which an author's second article reaches the top 20% mark is approximately 11%.
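One way to check this kind of consistency is to count, per author, how many articles reach the threshold. The sketch below assumes the data is a list of `(author, claps)` pairs; the function name and the sample authors are hypothetical.

```python
from collections import Counter

def author_hit_rates(articles, threshold):
    """articles: list of (author, claps) pairs.

    Returns the share of authors with at least one, and at least two,
    articles whose claps reach the threshold.
    """
    authors = {author for author, _ in articles}
    hits = Counter(author for author, claps in articles if claps >= threshold)
    at_least_one = sum(1 for a in authors if hits[a] >= 1)
    at_least_two = sum(1 for a in authors if hits[a] >= 2)
    return at_least_one / len(authors), at_least_two / len(authors)

# Hypothetical toy data: four authors, threshold of 225 claps.
sample_articles = [
    ("ama", 300), ("ama", 250),
    ("kofi", 10),
    ("esi", 400), ("esi", 20),
    ("yaw", 5),
]
one_hit, two_hits = author_hit_rates(sample_articles, threshold=225)
print(one_hit, two_hits)
```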

Answering the 9 most pressing questions.

Now let's tackle the first four questions.

1. What are the most used keywords in headers and sub-headers, and which of these keywords are often used in articles with a great number of claps?

Let's assume that data, science and how will appear among the frequently used keywords. Now guess what we got?

  • The top five keywords used in both the main and sub-headers are data, learning, python, using and machine.
  • We also noticed that classification, introduction, building, neural, make and build are the least used keywords.
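Keyword frequencies like these can be obtained by tokenizing every header and counting the words. The following is a minimal sketch; the tokenizing regex, the small stop-word set and the sample headers are all assumptions, since the article does not describe its exact preprocessing.

```python
import re
from collections import Counter

# Assumed stop words; the original analysis may have used a different list.
STOP_WORDS = {"a", "an", "the", "with", "for", "in", "to", "of", "and"}

def top_keywords(headers, n=5):
    """Count lowercase word tokens across headers, ignoring stop words."""
    counts = Counter()
    for header in headers:
        tokens = re.findall(r"[a-z0-9]+", header.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)

# Hypothetical headers for illustration.
sample_headers = [
    "Data Science with Python",
    "Learning Python for Data Analysis",
    "Using Data in Machine Learning",
]
keywords = top_keywords(sample_headers, n=3)
print(keywords)
```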

Before we jump to conclusions and choose these keywords for our next post, let's see which of them drive the most claps, to determine the ones to prioritize.

  • Among the top 20 keywords, 8 appear more often in the top 20% of articles.
    These are pandas, 5 (e.g. Top 5 Open Data Science Competitions with Cash Prizes), simple, science, learn, make, data and python.

So to conclude, if nothing else, you have seen the keywords writers often use and the ones that appear frequently in the top articles.

Hence, you can consider writing about these top 20 keywords or, better still, prioritizing the 8 keywords above in your headers.

Also, note this.

Instead of leaving the count in your header vague, state it explicitly. Add a number, say 5, to make it more precise.

2. Does having any of the top 20 keywords within your headers increase the number of claps on your article?

Knowing the keywords to prioritize in your header is not necessarily enough. Let's see if using them can make a difference.

  • The plot shows that articles with the keywords in their headers earn an average of 206.39 claps, while those without earn 177.38 claps.

This suggests that having at least one of the top keywords in your header may give you an edge.
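The comparison above boils down to splitting articles by whether the header contains any top keyword and averaging claps in each group. Here is a minimal sketch under the assumption that each article is a `(header, claps)` pair; the keyword set and sample rows are hypothetical.

```python
import statistics

def mean_claps_by_keyword_presence(articles, keywords):
    """Return (mean claps with a keyword, mean claps without).

    Assumes both groups are non-empty; statistics.mean raises otherwise.
    """
    with_kw, without_kw = [], []
    for header, claps in articles:
        tokens = set(header.lower().split())
        # A non-empty intersection means at least one keyword is present.
        (with_kw if tokens & keywords else without_kw).append(claps)
    return statistics.mean(with_kw), statistics.mean(without_kw)

# Hypothetical data for illustration.
sample_articles = [
    ("Learn pandas fast", 300),
    ("My weekend notes", 100),
    ("Python tips", 200),
    ("A quiet walk", 50),
]
with_mean, without_mean = mean_claps_by_keyword_presence(
    sample_articles, {"pandas", "python", "data"}
)
print(with_mean, without_mean)
```

A difference in means like this shows an association, not proof that the keyword itself causes the extra claps.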

Okay, to take the keyword scenario a little further, let us see how these words are used in combination with other words. This brings us to the next question…

3. What are the most used two-keyword combinations in our headers, and which ones can be used to drive more claps?

  • Among the twenty most frequent bigrams (double keywords), the top four are how to, data science, machine learning and in python.
    The last four are build a, in data, how i and reinforcement learning.
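A bigram is simply a pair of adjacent words, so the ranking above can be built by pairing each token with its successor and counting. This is a minimal sketch with whitespace tokenization; the helper names and the sample headers are assumptions.

```python
from collections import Counter

def bigrams(header):
    """Return adjacent word pairs of a header, lowercased."""
    tokens = header.lower().split()
    return [" ".join(pair) for pair in zip(tokens, tokens[1:])]

def top_bigrams(headers, n=4):
    """Count bigrams across all headers and return the n most frequent."""
    counts = Counter()
    for header in headers:
        counts.update(bigrams(header))
    return counts.most_common(n)

# Hypothetical headers for illustration.
sample_headers = [
    "How to learn data science",
    "How to build a model in Python",
    "Data science in Python",
]
frequent = top_bigrams(sample_headers, n=3)
print(frequent)
```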

Oops, I know what you are about to ask.

How do these bigrams contribute to claps?

To find out, we plot the bigrams against the average number of claps, and the result was, errrm… kinda surprising.

  • how i has the highest average number of claps, followed by data scientist and you should, and then the rest follow.

If you recall, how i was one of the least used of the top twenty bigrams, yet it attracted the highest average claps.
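To relate bigrams to claps rather than to frequency, one can accumulate the claps of every article whose header contains a given bigram and divide by the number of such articles. The sketch below assumes `(header, claps)` pairs; the function name and the sample rows are hypothetical.

```python
from collections import defaultdict

def mean_claps_per_bigram(articles, min_count=1):
    """Map each bigram to the mean claps of the articles containing it."""
    totals = defaultdict(lambda: [0, 0])  # bigram -> [clap sum, article count]
    for header, claps in articles:
        tokens = header.lower().split()
        # A set, so a bigram repeated within one header counts once.
        for bigram in {" ".join(p) for p in zip(tokens, tokens[1:])}:
            totals[bigram][0] += claps
            totals[bigram][1] += 1
    return {b: s / c for b, (s, c) in totals.items() if c >= min_count}

# Hypothetical data for illustration.
sample_articles = [
    ("How I got a data science job", 900),
    ("How to clean data", 100),
    ("How to plot data", 200),
]
means = mean_claps_per_bigram(sample_articles)
print(means["how i"], means["how to"])
```

The `min_count` parameter is there because a rare bigram can show a high average on the strength of a single popular article.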

A likely reason is this:

How I is usually used to tell stories, and a study conducted by the neuroscientist Uri Hasson concluded that storytelling causes the neurons of an audience to sync with the storyteller’s brain.

Humans also relate to tales, so telling them is a fascinating way to present. Stories engross the listener, elicit empathy, build trust, and inspire action. Read more here.

Examples of how I stories are:

  • How I Became a Data Analyst by Optimizing the Right Place and Time
  • How I got a Data Science Job in Canada
  • How I published an app and model to classify 85 snake species (and how you can too)

Well, I need not tell you that the ‘How I’ titles above really trigger curiosity.

I look forward to seeing one in your next post.

4. Okay, next question. Should I always include a sub-header, or can I choose to ignore it?

During my study, I found that 16% of the articles in my sample dataset do not have sub-headers even though they have main headers.

So I decided to find out whether having a sub-header can also put you in a good position for more claps.

  • Articles with sub-headers averaged 203 claps, while those without sub-headers averaged 158.

This makes sense because articles with both a main header and a descriptive sub-header containing well-targeted keywords appear more easily in their intended readers' searches. Hence:

  • There is an increase in user outreach and overall click-through.
  • By reaching the right users, you also have a better chance of earning a clap, since your article is probably what they are looking for.

So the takeaway is: write your header using the right keywords, and don’t forget to include a descriptive and insightful sub-header.

5. Is there any specific length of headers to use?

With this, I just tried to clear the air.

Towards Data Science editors encourage using a short main header and an insightful sub-header.

As discussed earlier, writing a descriptive main header and sub-header helps Google index your article, so it shows up in search previews and attracts more of your intended readers.

In that regard, let's see if there is a recommendable header length.

The length here is the number of words in the header.

  • There are spikes of claps for headers containing 13 to 17 words.
  • Claps alternate up and down for headers containing 30 to 50 words and stabilize from 60 words onward.
  • The number of claps drops after a length of 20 words in the main header and 60 in the sub-header.

In conclusion, there is no specific merit in the header length you choose, but you may consider using between 14 and 17 words for your main header and 30 to 50 words for your sub-header.

But most importantly, know how to choose your words.
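Header length as defined above (the number of words) can be related to claps by grouping articles by word count and averaging. This is a minimal sketch assuming `(header, claps)` pairs; the function name and sample data are hypothetical.

```python
import statistics
from collections import defaultdict

def mean_claps_by_word_count(articles):
    """Group articles by header word count and average claps per group."""
    by_length = defaultdict(list)
    for header, claps in articles:
        by_length[len(header.split())].append(claps)
    return {n: statistics.mean(v) for n, v in sorted(by_length.items())}

# Hypothetical data for illustration.
sample_articles = [
    ("Intro to pandas", 100),
    ("Plotting with matplotlib", 300),
    ("A gentle guide to neural networks", 50),
]
length_means = mean_claps_by_word_count(sample_articles)
print(length_means)
```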

6. How does reading time affect the number of claps?

I plotted the initial distribution of read time and observed that 87% of the articles have a reading time of 2 to 10 minutes.

That is just by the way :)

  • Articles with a reading time between 20 and 67 minutes have the highest average number of claps, followed by 10–20, 5–10 and, lastly, 0–5 minutes.
  • Also, articles with 5 to 20 minutes of reading time have a similar chance of getting a similar number of claps from readers.
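The reading-time comparison relies on placing each article in a bin and averaging claps per bin. The sketch below uses the four ranges quoted above; treating each bin as excluding its lower edge and including its upper edge is an assumption, as are the function name and the sample rows.

```python
import statistics

# Bin edges taken from the ranges in the text (minutes).
BINS = [(0, 5), (5, 10), (10, 20), (20, 67)]

def mean_claps_by_read_time(articles, bins=BINS):
    """articles: list of (read_minutes, claps). Returns mean claps per bin."""
    grouped = {b: [] for b in bins}
    for minutes, claps in articles:
        for low, high in bins:
            if low < minutes <= high:  # assumed: right-inclusive bins
                grouped[(low, high)].append(claps)
                break
    # Skip empty bins so statistics.mean never sees an empty list.
    return {f"{lo}-{hi}": statistics.mean(v) for (lo, hi), v in grouped.items() if v}

# Hypothetical data for illustration.
sample_articles = [(3, 50), (4, 70), (7, 120), (15, 130), (30, 400)]
read_time_means = mean_claps_by_read_time(sample_articles)
print(read_time_means)
```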

Therefore we can say that:

  • It is not advisable to forcibly shorten your article just because you think a longer article will bore your readers.
  • A shorter reading time does not guarantee appreciation from your readers; arguably, the quality of the content does.

Just know that every reader is ready to spend time on your article as long as it is worth the time. So as much as you try to save readers' time, you must also prioritize quality.

Just to let you know:

I checked manually and found that most articles with a higher reading time are either top research work of interest or posts from tech giants such as Google or Facebook.

7. Does the number of tags you choose contribute to your claps?

The plot for this question is a clear indication of how crucial the number of tags you use in your article is.

  • The plot shows that the more tags, the more claps.

This likely reflects the fact that most Medium readers follow topics. So, for Medium's SEO to steer your article to many readers following diverse topics, it is recommended that you use as many carefully chosen topics related to your article as possible.

  • Also, there is not much difference between articles with zero to 3 tags, but there is a significant difference from 3 to 4 and from 4 to 5.
  • Moreover, 86.20% of articles use 5 tags, while 8.39% use 4 tags, and then the rest follow.
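Both tag observations, the share of articles per tag count and the average claps per tag count, come from the same grouping. Here is a minimal sketch assuming each article is a `(tags, claps)` pair; the function name and the sample tags are hypothetical.

```python
import statistics
from collections import defaultdict

def tag_count_report(articles):
    """articles: list of (tags, claps).

    Returns {number of tags: (share of articles, mean claps)}.
    """
    by_count = defaultdict(list)
    for tags, claps in articles:
        by_count[len(tags)].append(claps)
    total = len(articles)
    return {n: (len(v) / total, statistics.mean(v)) for n, v in sorted(by_count.items())}

# Hypothetical data for illustration.
sample_articles = [
    (["python"], 40),
    (["python", "data", "ml", "nlp"], 150),
    (["python", "data", "ml", "nlp", "ai"], 200),
    (["python", "data", "ml", "nlp", "ai"], 300),
]
report = tag_count_report(sample_articles)
print(report)
```

Reporting the share next to the mean matters here: with most articles using 5 tags, the averages for lower tag counts rest on far fewer articles.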

In all, it is recommended that you use at least 4 tags on your post.

8. Should I embed several images to capture readers' interest in my post?

Hypothesis:
There is a saying, “A picture is worth a thousand words”, so we assume that the more images you use in your post, the more readers appreciate it.

But see:

  • We got a very low correlation of 0.03 between the number of images and the number of claps.

This suggests the number of images has little to no effect on the reader’s level of appreciation.

  • In the bar plot, articles with and without images showed an equal average number of claps.

Hence, instead of embedding plenty of images in the hope of driving readers' interest, it is better to use them only when needed.
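The article does not say which correlation coefficient was used; assuming it is the common Pearson coefficient, a minimal hand-written sketch looks like this. The function name and the toy numbers are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy data: a perfectly linear relationship gives r = 1.
image_counts = [1, 2, 3, 4]
claps = [10, 20, 30, 40]
r = pearson(image_counts, claps)
print(r)
```

A value near 0, such as the 0.03 reported above, indicates almost no linear relationship; it does not rule out a non-linear one.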

9. Is the article’s date of publication a factor in earning more reader interest?

Hypothesis:
The article’s date of publication also influences its number of reads and claps.
Articles published on weekends earn fewer claps.

Month

Due to the numerous uncertainties and tragedies during that period, we can’t confidently draw a conclusion about the month. However, factors such as depression, anxiety and stress may affect how users behave towards posts.

Weekday

  • Friday recorded the highest number of claps; in contrast, Saturday recorded the lowest.

So posting your article on Saturday is not recommended as far as these findings are concerned.

Weekend

  • Articles published on weekdays earn slightly more claps than those published on weekends.

So you may consider publishing your article on weekdays.
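The weekday and weekend comparisons can both be derived from the publication date. The sketch below assumes each article is a `(date, claps)` pair and compares averages; the function names and the sample dates are hypothetical.

```python
import statistics
from collections import defaultdict
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

def mean_claps_by_weekday(articles):
    """articles: list of (publication date, claps). Mean claps per weekday name."""
    by_day = defaultdict(list)
    for published, claps in articles:
        by_day[DAYS[published.weekday()]].append(claps)  # Monday is 0
    return {day: statistics.mean(v) for day, v in by_day.items()}

def mean_claps_weekday_vs_weekend(articles):
    """Return (mean claps on Mon-Fri, mean claps on Sat-Sun)."""
    weekday = [c for d, c in articles if d.weekday() < 5]
    weekend = [c for d, c in articles if d.weekday() >= 5]
    return statistics.mean(weekday), statistics.mean(weekend)

# Hypothetical data: two Fridays and one Saturday in September 2021.
sample_articles = [
    (date(2021, 9, 3), 300),
    (date(2021, 9, 10), 200),
    (date(2021, 9, 4), 90),
]
by_day = mean_claps_by_weekday(sample_articles)
weekday_mean, weekend_mean = mean_claps_weekday_vs_weekend(sample_articles)
print(by_day, weekday_mean, weekend_mean)
```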

Conclusion

Just a quick recap, since the details have already been given.

  • Make very good use of the top 20 keywords in your header.
  • Start writing the How I article. Make it personal and let's see the result.
    Let me know if you get some outstanding results.
  • Write your header using the right keywords, and don’t forget to include a descriptive and insightful sub-header.
  • A shorter reading time does not guarantee appreciation from your readers; arguably, the quality of the content does.
  • Use at least 4 tags on your post.
  • Instead of embedding plenty of images in the hope of driving readers' interest, use them only when needed.
  • Posting your article on Saturday is not recommended.
  • Consider publishing your article on weekdays.

I nearly forgot.

Download the full code from my GitHub repository and hit that 50 claps.

Joachim Kuleafenu
Analytics Vidhya

Software Engineer. I build smart features with Machine Learning techniques.