Using Data to Improve any YouTube Channel

Konrad Schmitz
12 min read · Apr 7, 2023

A YouTube Case Study (Python/NLP/Excel/Tableau)

Introduction

There are a few key factors that strongly influence the performance of a YouTube channel. But what happens when these factors constantly change? The fast pace of social media today keeps creators on edge, trying to figure out the “rules of the game”.

This study aims to analyse the development of these key factors through the years, and nail down some good practices that can improve any YouTube channel.

Table of Contents

…….▹ YouTube Evolution through the Years

…….▹ The Impact of Video Length

…….▹ Publishing Frequency and Consistency

…….▹ Best Day of the Week to Publish

…….▹ Tags

…….▹ Title Length

…….▹ Title Sentiment Analysis (NLP)

The Scenario

The data was pulled from a list of channels that I personally enjoy or that I believe to be very influential in the history of YouTube. They range from very niche to very broad, and from small numbers of views and followers all the way to the biggest channels on the platform. The dataset includes data about the channels themselves, and also about every video they have posted since their creation.

My main question was: What are key rules to follow in order to grow a YouTube channel? As I collected, processed and analysed this data, several smaller questions arose.

Collecting and Processing the Data

Collecting the Data

Using Python in JupyterLab and querying the YouTube API, data was pulled from the previously mentioned list of channels. A personal API key was generated using the Google Cloud environment.

All my code for this project and a complementary project about YouTube introductions are on this GitHub repository. And it is essential to mention that my project was initially based on Thu Vu’s project.

From each channel's ID it was possible to run code that pulls initial data about the channel itself, and from there data about every video published on each channel in this list.
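To make the shape of the collected data more concrete, here is a minimal sketch of how one video entry returned by the API could be flattened into a single row. It is not the code from the repository: the function name `flatten_video_item` and the sample values are made up for illustration, and it assumes the response was requested with the `snippet`, `contentDetails` and `statistics` parts.

```python
def flatten_video_item(item: dict) -> dict:
    """Flatten one video entry from an API response into a flat row.

    Hypothetical helper for illustration; assumes the snippet,
    contentDetails and statistics parts were requested.
    """
    snippet = item.get("snippet", {})
    details = item.get("contentDetails", {})
    stats = item.get("statistics", {})
    return {
        "video_id": item.get("id"),
        "channel_title": snippet.get("channelTitle"),
        "title": snippet.get("title"),
        "published_at": snippet.get("publishedAt"),
        "tags": snippet.get("tags", []),      # key is absent when a video has no tags
        "duration": details.get("duration"),  # ISO 8601 string, e.g. "PT10M30S"
        # Counts arrive as strings and can be missing (e.g. hidden likes),
        # so they are kept raw here and converted during processing.
        "view_count": stats.get("viewCount"),
        "like_count": stats.get("likeCount"),
        "comment_count": stats.get("commentCount"),
    }


# Made-up sample shaped like one video entry.
sample_item = {
    "id": "abc123",
    "snippet": {
        "channelTitle": "Example Channel",
        "title": "I ran 20 days without sleeping",
        "publishedAt": "2023-04-07T15:00:00Z",
        "tags": ["running", "challenge"],
    },
    "contentDetails": {"duration": "PT10M30S"},
    "statistics": {"viewCount": "1500", "commentCount": "12"},
}

row = flatten_video_item(sample_item)
print(row["title"], row["view_count"], row["like_count"])
```

Keeping missing counts as `None` rather than 0 avoids mistaking "hidden" for "zero" later in the analysis.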

The final data appended to this dataset were polarity scores from a sentiment analysis of all the titles, run using the NLTK library. A RoBERTa model trained on Twitter data was used for this analysis, generating three scores for each title: a positive, a negative and a neutral score, each ranging from 0 (lowest) to 1 (highest).
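For readers wondering where three scores between 0 and 1 come from: classification models of this kind typically output one raw score (a logit) per class, and a softmax turns those into values that sum to 1. The sketch below shows only that last step, with made-up logits and an assumed label order; it does not load any model and is not taken from the project code.

```python
import math


def softmax(logits):
    """Turn raw class scores into values between 0 and 1 that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


# Assumed label order and made-up logits for a single title.
LABELS = ("negative", "neutral", "positive")
example_logits = [-1.2, 2.3, 0.4]

scores = dict(zip(LABELS, softmax(example_logits)))
print(scores)
```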

Processing the Data

Having this main data frame, some of the following steps to process it included:

  • Checking for null values
  • Checking column types and converting them to the appropriate type
  • Converting timestamps
  • Generating the day of the week each video was posted
  • Calculating the number of tags per video
  • Going over the data to check for poorly collected information
  • Checking for stray whitespace or typos
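Several of the steps above can be sketched in a few lines. This is a dependency-free illustration on a single row stored as a plain dict, not the data frame code used in the project. The helper names, the sample row and the timestamp format are assumptions, and the duration parser only handles the hours/minutes/seconds form.

```python
import re
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
_DURATION = re.compile(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?")


def duration_to_seconds(iso: str) -> int:
    """Convert an ISO 8601 duration such as 'PT10M30S' to seconds."""
    match = _DURATION.fullmatch(iso)
    if match is None:
        raise ValueError(f"unsupported duration: {iso!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def to_int(value):
    """Convert a count stored as a string to int, keeping missing values as None."""
    return int(value) if value not in (None, "") else None


def process_row(raw: dict) -> dict:
    """Apply type conversion, timestamp parsing and derived columns to one row."""
    published = datetime.strptime(raw["published_at"], "%Y-%m-%dT%H:%M:%SZ")
    return {
        "title": raw["title"].strip(),           # remove stray whitespace
        "published_at": published,               # timestamp as a datetime
        "publish_day": DAYS[published.weekday()],
        "tag_count": len(raw.get("tags") or []),
        "duration_seconds": duration_to_seconds(raw["duration"]),
        "view_count": to_int(raw.get("view_count")),
        "like_count": to_int(raw.get("like_count")),
    }


# Made-up raw row for illustration.
raw_row = {
    "title": "  I ran 20 days without sleeping ",
    "published_at": "2023-04-07T15:00:00Z",
    "tags": ["running", "challenge"],
    "duration": "PT10M30S",
    "view_count": "1500",
    "like_count": None,
}

processed = process_row(raw_row)
print(processed["publish_day"], processed["tag_count"], processed["duration_seconds"])
```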

Analysing the Data

YouTube evolution through the Years

We can observe in Graph 1 that the number of videos published kept growing through the years until 2019, after which uploads started to decline.

It's almost as if a crazy pandemic suddenly hit planet Earth and left people stuck at home facing hard times or something.

Graph 1 — Videos Published, View Count and Sum of Views through the Years

Average views boomed between 2012 and 2015, at a time when YouTube prioritised channels that posted more often. This helped fuel the daily vlog trend, as well as clips cut from live videos, as creators tried to keep up with the algorithm at the time. Two huge channels that influence this curve are Casey Neistat (the biggest name in daily vlogging) and PewDiePie (the biggest gamer and channel at the time).

After that, both of these channels openly spoke about reducing the amount of videos they were posting in order to avoid what is now called “creator burnout”. This concept took some years to land with the remaining channels, but the pandemic might have finally brought it to light.

The sum of views falls sharply after 2019, raising the possibility that people watched fewer YouTube videos in total. The emergence of TikTok might also have played a part.

Right below, Graph 2 shows the evolution of TikTok quarterly downloads from 2017 to 2022. We can spot that they rapidly grew starting in 2018.

Graph 2 — TikTok quarterly downloads 2017 to 2022 (provided by https://infogram.com/tiktok-quarterly-downloads-1h0r6rpzwg7gw2e)

The Impact of Video Length

On Graph 3 we can observe a trend: the longer the video, the greater the amount of views, comments and likes.

Graph 3 — Videos Published and Average (Views/Comments/Likes) by Video Length

All three metrics peak at 10 and 15 minutes, which reflects the priority that a YouTube algorithm update gave to videos longer than 10 minutes. It even pushed creators into deliberately making longer edits to take advantage of that fact.
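The per-length averages behind a chart like Graph 3 could be computed along these lines. This is a rough sketch, assuming each video is a dict with `duration_seconds` and `view_count` keys (hypothetical names) and that lengths are bucketed by whole minutes; the sample numbers are made up.

```python
from collections import defaultdict


def average_views_by_minute(videos):
    """Group videos by whole-minute length and average their views."""
    totals = defaultdict(lambda: [0, 0])  # minute -> [sum of views, number of videos]
    for video in videos:
        if video["view_count"] is None:
            continue  # skip videos with missing view counts
        minute = video["duration_seconds"] // 60
        totals[minute][0] += video["view_count"]
        totals[minute][1] += 1
    return {minute: total / count for minute, (total, count) in sorted(totals.items())}


# Made-up sample: two videos of about 10 minutes, one of 5 minutes.
sample_videos = [
    {"duration_seconds": 620, "view_count": 1000},
    {"duration_seconds": 640, "view_count": 3000},
    {"duration_seconds": 300, "view_count": 500},
]

averages = average_views_by_minute(sample_videos)
print(averages)
```

The same grouping works for comments and likes by swapping the key that is summed.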

Longer videos might have more views, but most videos last 10 minutes or less. I hypothesise that creators who publish videos longer than 11 minutes are simply being true to their content and to their audience, using the length of videos they believe to be best to tell their story.

Good examples of that are reaction videos and gameplay videos, both of which perform well at longer durations.

Graph 4 — Length of Videos through the Years (Excluding PewDiePie)

Graph 4 shows the impact of the algorithm update that prioritised 10-minute-long videos. We also see that the average video length rapidly grew to 10 or 11 minutes after 2016. This graph excludes PewDiePie because of the outsized impact he had at the beginning of the decade.

Publishing Frequency and Consistency

Up next, Graph 5 highlights the relationship between views and the number of published videos. Some channels were filtered out to avoid extreme outliers and build a clearer visualisation: Dude Perfect, MrBeast and Mark Rober, all of which have an insane amount of views compared to how often they publish.

Graph 5 — Videos Published and Average View Count per Channel

Here the grey line shows the number of published videos, and the red bars represent the average number of views. We can instantly notice interesting points, such as Seth James DeMoor being the channel with the most published videos in this filtered list while still having fewer than 26,000 views on average.

This happens with very niche channels, such as those about tech, sports, software, mental health and investing. These channels tend to publish more often to keep up with every new detail in their field, and to cover all the latest news surrounding the community.

At the other end, some channels have published a small number of videos and have some of the highest average views on the platform. These channels tend to offer pure entertainment (like Airrack) or well-built video essays (like Veritasium), which allows them to make longer videos while keeping retention.

So how often should you post anyway? Is there any sort of rule or standard to guide us?

Graphs 6 and 7 show the relationship between publishing frequency and channel growth. For Graph 6, big channels with different posting schedules and frequencies were carefully selected.

Graph 6 — Subscriber Growth and Post Frequency for Bigger Channels

A great example of how inconsistent posting can halt growth is Casey Neistat's channel: his growth curve nearly flattened, climbing only at a slow, steady pace. At the other end, MrBeast, posting at a reasonably consistent rate, grew his audience by almost 400%.

Graph 7 considers the same relationship, this time for smaller channels:

Graph 7 — Subscriber Growth and Post Frequency for Smaller Channels

A channel like Colin and Samir got “lucky” with some videos that performed well and then boomed, even though they didn’t post as often as most channels. At the same time, Seth James DeMoor was posting very consistently, yet still not getting much traction. Joshua Weissman was able to grow his subscribers sevenfold, posting twice a week on average while creating unique cooking videos.

With all that in mind, we can say that consistency and a certain frequency are important, but not decisive. No matter how niche your channel is, there is always room to grow to a reasonably large audience, and the main driver is always going to be: CONTENT.

Best Day of the Week to Publish

Ok, we've discussed some YouTube history and talked about video lengths and posting schedules. But is there any way to pick the best day of the week to publish? Let's take a look at Graph 8:

Graph 8 — Amount of Videos Published per Day of the Week

Weekdays have more videos published, and Wednesday has the biggest amount of videos published. Naturally we wonder: is Wednesday actually a good day to publish?

Graph 9 — Channel Sum of Best Ranked Avg. Views per Day of Week Published

Graph 9 is obtained by ranking, for each channel, the days of the week from best to worst according to the average views of videos published on that day. Yes, I know it sounds a bit confusing, but stick with me.

The graph above presents the combined results of each channel's best performing day. We could assume that Sunday is the best day to post. But it is not that simple!

This is where a heat map comes in handy. A quick glance at Graph 10, right below, shows that the ranking is very hard to predict across several channels.

Graph 10 — Channels Ranking of Best Day of the Week to Publish

So if we add up all the ranking scores from this heat map, we end up with a progression curve through the week, seen next in Graph 11.

The ideal curve for a day would start high, with many channels for which it ranked as the “best day”, and end low, with few channels for which it ranked as the worst.

Graph 11 — Publish Ranking per Day of the Week

With that in mind, we can see that weekdays tend to perform slightly better, and that Wednesday indeed stands out as having one of the best shaped curves, along with Tuesday. Sunday turns out to have a poorly shaped curve: it has some of the best averages, but also some of the worst.
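The ranking approach described in this section can be sketched as follows. It is an approximation of the idea rather than the exact method behind Graphs 9 to 11: the field names `channel`, `publish_day` and `view_count` are assumptions, ties are not handled specially, and the sample numbers are made up.

```python
from collections import Counter, defaultdict


def rank_days_per_channel(videos):
    """For each channel, rank days of the week by average views (1 = best day)."""
    sums = defaultdict(lambda: [0, 0])  # (channel, day) -> [sum of views, video count]
    for video in videos:
        key = (video["channel"], video["publish_day"])
        sums[key][0] += video["view_count"]
        sums[key][1] += 1

    averages = defaultdict(dict)  # channel -> {day: average views}
    for (channel, day), (total, count) in sums.items():
        averages[channel][day] = total / count

    ranks = {}
    for channel, per_day in averages.items():
        ordered = sorted(per_day, key=per_day.get, reverse=True)
        ranks[channel] = {day: position for position, day in enumerate(ordered, start=1)}
    return ranks


def rank_distribution(ranks):
    """For each day, count how many channels placed it at each rank position."""
    distribution = defaultdict(Counter)
    for channel_ranks in ranks.values():
        for day, position in channel_ranks.items():
            distribution[day][position] += 1
    return distribution


# Made-up sample with two channels.
sample_videos = [
    {"channel": "A", "publish_day": "Tuesday", "view_count": 100},
    {"channel": "A", "publish_day": "Tuesday", "view_count": 300},
    {"channel": "A", "publish_day": "Sunday", "view_count": 500},
    {"channel": "A", "publish_day": "Wednesday", "view_count": 50},
    {"channel": "B", "publish_day": "Tuesday", "view_count": 900},
    {"channel": "B", "publish_day": "Sunday", "view_count": 100},
    {"channel": "B", "publish_day": "Wednesday", "view_count": 400},
]

ranks = rank_days_per_channel(sample_videos)
distribution = rank_distribution(ranks)
print(ranks["A"], dict(distribution["Sunday"]))
```

In this toy sample Sunday is the best day for one channel and the worst for the other, which is the kind of split that makes a day's curve hard to trust.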

Tags

Next, we can check the relationship between tags and performance, starting with Graph 12:

Graph 12 — Videos Published by Amount of Tags

The amount of tags per video peaks at 20, and we tend to see small amounts of videos with over 30 tags. Could we say that tags actually impact views or clicks?

Graph 13 — Videos Published and Average Views by Amount of Tags

As Graph 13 shows, channels have very different relationships between views and number of tags. Some channels with very niche content might perform better when more tags are added. Examples are channels that cover specific products (like running gear) or educational topics (such as politics or physical performance).

At the same time, a channel whose content is mostly not driven by audience demand (such as a cooking channel or a philosophy channel) sees no improvement when adding more tags.

Title Length

It is often cited that the average title length on YouTube is 44 characters. A title of this length performs well because it displays in full on both the mobile and desktop versions, and arguably also because it is a comfortable amount of text for describing a topic.

Graph 14 — Average Title Length per Channel

But what can we observe from some outliers? Here in red, we can see some of the channels that have much shorter titles than average. These are: Casey Neistat, Dude Perfect, MrBeast and Veritasium, all of which have some of the highest average views on the platform.

Perhaps it means that well written shorter titles perform better! Or that they have such a good video topic that they can ignore the idea of using 44 characters.

In purple, we see some channels with longer titles. Examples here are Crash Course, SmarterEveryDay and WIRED, all working in the education and information sector. Perhaps they can afford longer titles because these explain the content better.

Also in purple are two channels that specifically teach how to improve YouTube performance (Film Booth and Think Media), which just makes me smirk a bit out of curiosity.

In green, we can see the channels that fall within the 95% range around the mean. Some of them are my favourites and also among the best produced on the platform, such as Vox and Mark Rober.

Title Sentiment Analysis (NLP)

We then come to an experiment: running a sentiment analysis with the NLTK library on all the titles in our list. As seen in Graph 15, most titles are predominantly neutral. A quick correlation test between the number of views and these polarity scores found no correlation.

Graph 15 — Sentiment Analysis (Positive, Negative, Neutral) per Title
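A correlation check of this kind can be as simple as the sketch below. It assumes a Pearson correlation, which the study does not specify, and the scores and view counts are made up purely to show the call. A result close to 0 indicates no linear relationship, while values near 1 or -1 indicate a strong one.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / math.sqrt(spread_x * spread_y)


# Made-up negative scores and view counts for four titles.
negative_scores = [0.05, 0.40, 0.10, 0.30]
view_counts = [1200, 900, 15000, 700]

r = pearson(negative_scores, view_counts)
print(round(r, 3))
```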

Yet we can observe an interesting evolution of the average polarity scores through the years on Graph 16.

Graph 16 — Sentiment Score through the Years

On average, positive scores remained stable, and neutral scores decreased slightly. The average negative score, however, increased from a very low 0.05 in 2007 to 0.20 in 2023, suggesting that titles are becoming more negative, possibly to increase click rate.

This evolution could be explained by this study published in Nature, which reports that readers are 2.3% more likely to click on a headline when it contains a negative word.

On Graph 17 we can see how some channels are taking advantage of that. Self improvement and health channels are highlighted in green. It is natural for them to aim at sensitive topics that deal with negative feelings, such as lack of motivation or mental illnesses.

Educational channels with an edge of controversy, or channels that often discuss political subjects are highlighted in orange. These channels often use negative words to express extreme opinions about these controversial topics.

Graph 17 — Average Negative Sentiment Score per Channel

Final Thoughts

Suggestions from the data pulled in this study:

  • If you are posting a video shorter than 10 minutes, aim for 6 or 7 minutes in length. Otherwise, aim for 10 or 15 minutes.
  • Post on a regular schedule. Preferably once or twice a week, but at least once a month. Avoid long periods without posting.
  • Publish your video on weekdays, preferably on Tuesday or Wednesday. Extra tip: if you have enough videos posted, your best performing day might just be when your subscribed audience is most active. You can check that on your YouTube analytics page.
  • Aim to have between 15 and 25 tags on your videos. If your channel is very dependent on new product releases or news in general, you may try adding more tags to cover more search options and tackle competition.
  • Strive to use titles of 44 characters or fewer if you can, and stay within the range of 30–60 characters. Avoid titles with over 70 characters at all costs.
  • You may use a “strong” negative word to increase click rate if it enhances the clarity of the topic of the video. I would only suggest this if the content actually validates the title.

My personal suggestions, taken from years of experience improving several stats for clients on the YouTube platform:

  • Create a strong title with a good thumbnail even before recording the video. Then shape your video around what you originally planned.
  • Write a title that includes an Action and a Risk. Example: I ran 20 days without sleeping. Running is an action, and not sleeping for 20 days carries a risk. This puts us as viewers in the position of wondering what the outcome is. So it is your job to create a video with an interesting outcome.
  • Explain your video very efficiently and directly in your intro, so that you validate your title and thumbnail. The people who click just want to make sure they are in the right place. If you want to master a YouTube introduction, check this complementary study I did on that too.
  • Create a well segmented video and list the topics you will talk about, while supporting it with visuals.
  • Use text and sound to enhance the key words while you communicate in your video.
  • Try to use props, or give examples with anecdotes and real life situations, so that people relate better to the topic.
  • Study the hero’s journey and shape your video around that.
  • Watch this breakdown of Johnny Harris's video style. In my opinion he is one of the best storytellers out there, and this video clearly explains why.
  • For editors: try to add new elements to your video every 3–5 seconds. It can be a b-roll image, a different camera angle, a text, a sound, a zoom-cut. You name it.

Be original, be yourself, have fun 🤪
