We get a lot of inquiries about how we predict the demographics data for YouTube influencers, and how accurate it is. I have also noticed that some other platforms claim they can predict demographics data as well. So let me do some explanation and comparison here.
With a YouTuber’s authorization, we can get the official demographics from the YouTube Analytics API, and this data is supposed to be accurate. Let’s first compare the gender distribution. (I have to hide the YouTuber accounts for privacy.) In the image below, the left side is the data returned by the YouTube API and the right side is the data we calculated.
I purposely picked some YouTubers whose audience is mainly male (≥80%), some whose audience is mainly female (≥90%), and some with an almost even split between male and female. As you can see from the above, the gender estimation is pretty accurate. The average accuracy is around 95%.
Now let’s go ahead and check country distribution.
Actual: AR: 18.01%, CO: 7.12%, MX: 39.61%, ES: 6.43%, CL: 10.1%
Predicted: AR: 20%, CO: 11%, MX: 31%, ES: 3%, CL: 11%
As you can see, some countries are pretty accurate and some have larger errors (MX is off by almost 9 percentage points), but the ranking remains accurate: MX > AR > CL ≥ CO > ES. That is important because when you are trying to find a YouTuber to work with, you want to know where their audience is mainly located. Let’s check another group:
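To make the comparison above concrete, here is a minimal Python sketch that takes the actual and predicted country shares quoted in this post and checks two things: the mean absolute error in percentage points, and whether any pair of countries is ranked in the opposite order. The function names (`ranking`, `mean_abs_error`, `ranking_consistent`) are my own for illustration, and these metrics are just one reasonable way to score a prediction, not necessarily how the numbers in this post were evaluated.

```python
# Country shares (in percent) copied from the comparison in this post.
actual = {"AR": 18.01, "CO": 7.12, "MX": 39.61, "ES": 6.43, "CL": 10.1}
predicted = {"AR": 20.0, "CO": 11.0, "MX": 31.0, "ES": 3.0, "CL": 11.0}


def ranking(shares):
    """Return country codes sorted from largest to smallest share."""
    return sorted(shares, key=lambda code: shares[code], reverse=True)


def mean_abs_error(actual, predicted):
    """Average absolute difference per country, in percentage points."""
    return sum(abs(actual[c] - predicted[c]) for c in actual) / len(actual)


def ranking_consistent(actual, predicted):
    """True if no pair of countries is ordered oppositely.

    A tie in the prediction (like CL and CO here) counts as consistent,
    which matches the 'CL >= CO' reading of the ranking.
    """
    codes = list(actual)
    for i, a in enumerate(codes):
        for b in codes[i + 1:]:
            if (actual[a] - actual[b]) * (predicted[a] - predicted[b]) < 0:
                return False
    return True


if __name__ == "__main__":
    print("Actual ranking:", ranking(actual))
    print("Mean absolute error (points):", round(mean_abs_error(actual, predicted), 2))
    print("Ranking consistent:", ranking_consistent(actual, predicted))
```

Checking pairwise order instead of comparing two sorted lists is deliberate: the predicted shares for CL and CO are tied at 11%, so a strict list comparison would depend on how the tie happens to be broken.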
Actual: US: 47.14%, DE: 7.65%, GB: 5.06%
Predicted: US: 50%, DE: 9%, GB: 5%
Again, there are some small errors, but the relative ranking is accurate.
Now comes the difficult part: age distribution. It is difficult to estimate age based on NLP alone, so we have to grab face images from videos and avatars. But then again, estimating age from a face is also hard, especially when makeup is involved. (P.S. This is where fancy words like TensorFlow, AI and deep learning come in.) From our experiments, we found that we consistently get a less accurate number for the age13–17 group, because viewers in that group are wrongly categorized into the age18–24 group:
But other than that, it still looks amazingly accurate, right? Here is another one:
It feels like we should be able to correct the age13–17 group right away, but it’s not as easy as it seems. If we combine the two groups into a single 13–24 group, though, the estimate becomes very accurate.
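Here is a small Python sketch of why merging the two groups helps. The percentages below are made-up illustrative numbers, not real channel data, and the helper names (`merge_buckets`, `total_abs_error`) are hypothetical. The point is only that when the error comes from shifting viewers between two neighboring buckets, the error cancels out once those buckets are combined.

```python
def merge_buckets(dist, buckets, merged_name):
    """Return a copy of dist with the given buckets summed into one."""
    merged = {k: v for k, v in dist.items() if k not in buckets}
    merged[merged_name] = sum(dist.get(b, 0.0) for b in buckets)
    return merged


def total_abs_error(actual, predicted):
    """Sum of absolute differences across all buckets, in percentage points."""
    keys = set(actual) | set(predicted)
    return sum(abs(actual.get(k, 0.0) - predicted.get(k, 0.0)) for k in keys)


# HYPOTHETICAL numbers for illustration only (not from a real channel):
# the prediction moves 12 points from age13-17 into age18-24.
actual = {"age13-17": 20.0, "age18-24": 40.0, "age25-34": 30.0, "age35+": 10.0}
predicted = {"age13-17": 8.0, "age18-24": 52.0, "age25-34": 30.0, "age35+": 10.0}

young = ["age13-17", "age18-24"]
actual_merged = merge_buckets(actual, young, "age13-24")
predicted_merged = merge_buckets(predicted, young, "age13-24")

if __name__ == "__main__":
    print("Error before merging:", total_abs_error(actual, predicted))
    print("Error after merging:", total_abs_error(actual_merged, predicted_merged))
```

With real data the error after merging would not drop all the way to zero, since the other age groups carry their own errors as well.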
I randomly checked another platform that claims it can predict YouTube demographics as well, and I have to say, we are way ahead of them. I thought about putting their data here, but decided not to. If you are interested, just shoot me a message.
In addition, we also have a lot of other interesting data, like audience interest distribution, gender distribution within each age group, sponsored video performance, brands mentioned, etc. If you want to try it out, check out the world’s first real-time influencer performance analytics engine at SocialBook.
SocialBook is a DApp built upon BOOSTO, an influencer-driven decentralized DApp store. Follow us on Facebook, Twitter, Instagram, Reddit and LinkedIn, or join our Telegram community.
Join the blockchain community and become part of something great!