A Community Manager’s Guide to Using Statistical Methods for Insights
I remember going to a CMX Summit and watching Evan Hamilton speak. He did a presentation on measuring the ROI of your community. In his talk, Evan used pictures of kittens to keep things entertaining. I remember a feeling of anger over it.
Why do we need kittens?
I get upset at things like this because it encourages the stigma around math and science: the idea that you have to be a genius to like it. Math is valuable in and of itself and carries its own beauty.
There is a lot to gain from even minor uses of mathematics in your daily work. That is what I want to talk about today: how to make important decisions about your community using simple statistical techniques which you learned in high school.
I recognize that this article is not about traditional KPIs. We will instead be working with raw data collected manually. I have yet to encounter a community platform which gives strong analytic insights. Most focus on vanity metrics which are easy to track. They don’t give us much information that dictates a course of action.
You will be learning techniques which will give you concrete answers. They will tell you how behavior impacts your community.
Here is why there will be kitties.
I know why we need kittens.
There are many people who experience emotional and physical pain over mathematics. A research study from the University of Chicago used fMRI to understand math anxiety. The researchers looked at the areas of the brain involved and saw that areas associated with pain were active even when subjects only anticipated doing mathematics. They didn’t actually have to do the math because the anticipation was enough.
The current theory is that math anxiety chokes working memory, making the problems more difficult to solve. This has nothing to do with math ability. Those with math anxiety solve problems as correctly as those who don’t have a phobia.
Many people falsely believe that I have a natural talent or ability because I like math, that it comes easily to me. The truth is far starker. I hated mathematics as a child. I flunked my fourth-grade math class and had to repeat it in fifth grade. It was that year that I discovered I could play games with math. That has helped me learn new concepts ever since.
It has taken me considerable time and effort to learn the techniques which I want to discuss with you today. I cannot promise that you will love mathematics the way that I do now. I do think these methods will improve your work as a community manager. It will help you answer questions about the health of your community and its value.
So let’s dive in.
It’s all about your methodologies.
When I was in school at Humboldt State, I took classes taught by William Reynolds. You may recognize that name if you have ever taken courses in Psychology. He is most well known for his childhood depression scale. It is still a standard used today for diagnosis in adolescents. He’s a big deal.
The most challenging course he taught was methodology. This is the class where you learn how to design research studies. This is different than statistics where you are learning hypothesis testing. Methodology is about how you set up your experiment so that it answers your questions.
There are things that we need to keep in mind.
- The metrics we measure must relate to the outcome we want to observe.
- They must measure what we say they measure and not be explainable in other ways.
- They must be complete such that important factors are not missing.
- What we decide to measure will impact what we observe.
- What we choose to measure supports our conclusions.
- Repeated measures should have the same outcome.
- Researchers should be able to replicate your results using the same methods.
- Research methods should not be subjective (based on one person’s point of view.)
- Outcomes should not differ between different observers.
- Those doing observations should agree on what is being measured.
Validity and reliability are important, but much more so in experimental design, when we are looking for causality between independent variables and dependent variables. No community manager should expect to run an experiment in the way a scientist would.
There is an important reason why.
We work with biased data. The population we sample from is self-selected. They are the people who joined our communities by choice. The people who visited our website and made a purchase. We are not dealing with a representative sample of the general population. Any statistical method we use will be less robust, but still consequential.
Descriptive statistics tell us what is important.
Descriptive statistics are what they sound like: they describe a data set. The ones that I use most often are the measures of central tendency: mean, median and mode.
The mean is what you typically think of as an average. You add up all your numbers and divide by how many there are.
The mode is the most common number in the data set. There is only a mode if the number repeats. Sometimes it doesn’t exist.
The median is one of the most useful tools we have for analysis! The big issue with the type of data that we collect is variability. This makes the mean and mode less useful for understanding what is going on. The median really shines showing us just how this variability is impacting our data set.
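To make the difference between the three measures concrete, here is a minimal sketch in Python. The reply counts are made-up numbers for ten imaginary forum threads, and the function name `central_tendency` is my own. It only uses Python's built-in `statistics` and `collections` modules.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical reply counts for ten forum threads (made-up numbers).
replies = [2, 3, 3, 4, 5, 5, 5, 6, 8, 49]

def central_tendency(values):
    """Return the mean, the median, and a list of modes for a data set."""
    counts = Counter(values)
    top = max(counts.values())
    # There is only a mode if some value repeats; otherwise return an empty list.
    modes = [v for v, c in counts.items() if c == top] if top > 1 else []
    return mean(values), median(values), modes

avg, mid, modes = central_tendency(replies)
print(f"mean={avg}, median={mid}, modes={modes}")
```

In this made-up data set, one very long thread drags the mean well above what a typical thread looks like, while the median stays put. That is the variability problem described above.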
The median itself is a baseline, our average. Once we know the value of the median, we calculate an upper and lower limit and look for outliers that fall outside those limits. This data set covers reach and engagement for each update to the TSR Facebook page in May. Outliers are in green.
Those updates in green performed better than our average in a significant way. Let’s take a look at them.
It makes sense that this update would be a top performer. First of all, it’s a photo which the Facebook algorithm seems to prioritize. It features members of an academic club at a school where the students learn how to run RPGs. It’s a heartwarming feel-good story.
This update was popular because it was a product announcement. We want to see something like this in our outliers. It means it got more than the average attention and we need that to push sales.
Try this same analysis in a forum. Look at individual threads and measure their engagement. For example, you track all the replies to support threads in a month. Find the median to determine the average length of a thread. Then look for outliers. What do they show you about your community?
Here is a great tutorial on this subject, “Introduction to Statistics by Khan Academy.”
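Here is a rough sketch of the forum version of this analysis. I have not spelled out a formula for the upper and lower limit in this article, so the sketch assumes one common convention: limits set at 1.5 times the interquartile range below the first quartile and above the third quartile. The reply counts are made up, and `find_outliers` is just an illustrative name.

```python
from statistics import median, quantiles

# Hypothetical reply counts for a month of support threads (made-up numbers).
replies = [2, 3, 3, 4, 5, 5, 5, 6, 8, 49]

def find_outliers(values, k=1.5):
    """Return (lower limit, upper limit, outliers).

    Assumption: limits are placed k * IQR beyond the first and third quartiles.
    Other rules for choosing limits exist; pick the one that suits your data.
    """
    q1, _, q3 = quantiles(values, n=4)  # the three quartile cut points
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return lower, upper, outliers

lower, upper, outliers = find_outliers(replies)
print("median thread length:", median(replies))
print("limits:", lower, upper)
print("outlier threads:", outliers)
```

Once you have the outlier threads, go read them. The numbers only tell you where to look.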
Chart your growth and make predictions.
This is the first instance where we will use a scatter plot to make some predictions based on our data set. I typically collect the raw numbers for this. For example, this graph shows us the growth of likes for TSR’s Facebook page over the course of May.
The vertical axis is our number of likes. The horizontal axis is the day of the month. Each dot represents the total number of likes for that day. As you see, when plotted this way the dots are in a linear pattern.
We approximate that with a best-fit line through the dots.
There is something you should see right away while looking at this graph. The slope of the best-fit line is going up and to the right. This means that our page likes are growing in the positive direction, or we say that likes are increasing.
There are two other types of best-fit lines. If the line were close to horizontal, this would mean that there is no rate of change. Things have plateaued or stalled. The best-fit line may also slope down and to the right. This would mean that page likes were decreasing or unliking the page was increasing. The slope gives you a sense of how things are going. If the slope is steep, then things are changing quickly. If the line is more horizontal, they are changing slowly.
That funny looking equation represents our best-fit line as a function. There are two things about this to point out.
1. The slope of that equation is our rate of growth.
In this example, that is about 2.84 likes a day.
It’s the number in front of the x.
2. Assuming the rate of change continues, you can use this equation to predict future growth.
How long will it take us to reach 10,000 likes? Set the number of likes in the equation to 10,000 and solve for x.
x ≈ 316
In about 316 days.
This is also a tool that you could use in your forum. For example, plotting the growth of the community members by day for a month.
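Spreadsheet chart tools will draw the best-fit line for you, but the same idea can be sketched in a few lines of Python. This is a minimal example, assuming an ordinary least-squares fit. The member counts are made up, and the names `fit_line` and `days_to_reach` are only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept) of the best-fit line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def days_to_reach(target, slope, intercept):
    """Solve target = slope * x + intercept for x, assuming the trend continues."""
    if slope <= 0:
        return None  # flat or shrinking: the line never reaches the target
    return (target - intercept) / slope

# Hypothetical total member counts for each day of a 30-day month (made up).
days = list(range(1, 31))
members = [500 + 3 * d + (1 if d % 2 else -1) for d in days]

slope, intercept = fit_line(days, members)
print(f"growth rate: about {slope:.2f} members a day")
print(f"day number when we reach 1,000 members: about {days_to_reach(1000, slope, intercept):.0f}")
```

The prediction is only as good as the assumption behind it, which is that growth keeps going at the same rate.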
Discover relationships and trends in your data.
This requires a more complex data set. I have performed a content audit of the TSR blog, “Multiverse.” These numbers cover each article for May.
We will use this data to create more scatter plots, but this time they will show two metrics on the same graph. This type of analysis is a correlation. It is important because it tells us whether two metrics have a relationship and how related they are.
It does not tell us the cause of that relationship. We often say that correlation is not causation. This means that we can make no claims about the direction of a relationship between two metrics. Yet, this analysis still tells us a lot about what is going on in our community.
Let’s take a look at some of these charts.
This should look familiar. It is another scatter plot, but now each axis is a different metric. We have plotted total unique views vs. total shares. We want to answer the question, “Do posts with more shares get more unique views?” It could also be that more unique views lead to greater sharing. We cannot be sure of which way the relationship goes.
We do have an answer though. Look at the slope of that line. It tells us that there is a positive relationship between our two metrics. This means they both increase together. The slope of the line is not very steep, which tells us that one metric rises only a little as the other increases. Keep in mind that steepness depends on the units of each axis. How tightly the dots cluster around the line is what tells you how strong the relationship is.
The slope of the line here is almost horizontal. This suggests that the length of a post has little to no relationship with how many views that post receives. That is a good thing. Readers aren’t focused on word count.
This method is also applicable to those who work with a forum. You could plot views of a thread by how many replies there are in that thread. This would tell you if your community members care about how long a thread is.
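If you want a single number for how strongly two metrics move together, one standard option is the Pearson correlation coefficient. It runs from -1 to 1, and values near 0 mean little or no linear relationship. I did not calculate it for the charts in this article, so treat this as an optional extra. The thread numbers below are made up, and `pearson_r` is an illustrative name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a metric never changes")
    return sxy / sqrt(sxx * syy)

# Hypothetical replies and views for eight forum threads (made-up numbers).
replies = [3, 5, 8, 12, 15, 20, 24, 30]
views = [40, 70, 65, 120, 110, 190, 160, 260]

r = pearson_r(replies, views)
print(f"correlation between replies and views: {r:.2f}")
```

Unlike the slope, this number does not change if you measure views in thousands instead of single views. It still says nothing about which metric drives the other.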
Continuous data is valuable.
It is nice when you have a source which allows you to collect continuous data. Continuous just means that what we are measuring can take on any value. I look at this when writing articles for the TSR blog “Multiverse.” I look for trends for a single article over the course of three months. I gather data at weekly intervals. I then plot the results on a graph. Here is an example of an article optimized for the keyword phrase “board game upgrades.”
You’ll notice that for the first month, it only received about 20 views. Then there is about a threefold increase over the last two months. Continuous data shows me that this post was not well received when first published. Yet, it attained a good ranking in Google after that first month. Views increased due to organic traffic.
This is a good check on your content strategy. You could easily check pageviews on a thread on a weekly basis. Or select your most popular threads and track them over time. You could also change the metric to something like replies. Determine what is most important for you to measure.
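As a small sketch of that weekly check, the example below groups twelve weeks of made-up pageview counts into four-week periods and compares each period to the first one. The numbers are hypothetical and are not the real figures for the article above. The names `period_totals` and `fold_change` are my own.

```python
# Hypothetical weekly pageviews for one thread over twelve weeks (made up).
weekly_views = [4, 6, 5, 5, 12, 15, 14, 16, 15, 17, 16, 15]

def period_totals(weekly, weeks_per_period=4):
    """Sum weekly counts into consecutive periods of a fixed number of weeks."""
    return [
        sum(weekly[i:i + weeks_per_period])
        for i in range(0, len(weekly), weeks_per_period)
    ]

def fold_change(totals):
    """Express each period as a multiple of the first period."""
    first = totals[0]
    if first == 0:
        raise ValueError("cannot compare against a first period with zero views")
    return [t / first for t in totals]

totals = period_totals(weekly_views)
print("views per four-week period:", totals)
print("compared to the first period:", [round(f, 2) for f in fold_change(totals)])
```

Swap the pageviews for replies, or any other metric you care about, and the same check works.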
The Key Takeaways
- You don’t have to be a math prodigy to use these techniques. It is scary. Know that you are capable even when you feel the fear.
- Community platforms don’t always give us insights which lead to decision making. Instead, pull raw numbers manually into a spreadsheet and use the chart tools.
- Unlike traditional KPIs, these methods give us the ability to make predictions.
- When performing a correlation, the results do not show a cause and effect relationship. They only show that two metrics are related, not the direction of that relationship.
- Think of your questions first, and then determine what you need to measure. A strong approach will lead to a more robust analysis. This will help you make decisions about your community.
You’ll be as happy as a sloth in a tree.
Thanks for following along if you have been. Leave a comment below!