The Guide to Advanced Reddimetrics
Congratulations! You now have access to the most comprehensive, state-of-the-art system for evaluating participants in the Trade Deadline Game (Click here to view or download it). Now, you’re probably thinking, “The fuck is this shit? I don’t care about it.”
And that’s OK, but for the six of you that do care and want to read on, I am going to explain a few things:
1.) Why advanced reddimetrics was developed.
2.) The underlying assumptions.
3.) How to use the database.
First, why? When I initially signed up to be a GM, I read through the general description for the TDG and thought it was a really fun concept. I looked at the section talking about how GMs choose who to trade and I realized that it was incomplete. The primary issue I had was that, while there are a number of obviously valuable power users with high karma, I believed more recently created accounts would be overlooked. They hadn’t been allowed the same amount of time to build karma. I didn’t think it fairly portrayed the contributions of newer Redditors who might be just as active as others, if not more so.
To sum up, I thought that we needed a better way to fairly compare users regardless of account age. I think I should mention, I was specifically inspired by a comment by /u/The_Nats_Of_Us about how he creates new accounts more frequently. I thought, “Someone should try to fix that.”
Then I thought, “I need to eat some lunch.” I was really hungry. I also decided to help fix the problem. With some help from a friend to pull data from a really cool website, Snoopsnoo, I created an Excel sheet and got to work on creating several metrics for users. I thought the best ones would be something along the lines of (1) how much karma users bring in on a daily basis and (2) how often they submit posts or make comments.
Now there are a few assumptions made with the analysis:
1.) That users will continue to post and comment at a consistent rate. Yes, people do have surges of activity, but I would argue that it generally balances out. I would also argue that any increase in activity by users over the course of the TDG would be consistent across the board at scale.
2.) Minor differences in user behavior balance out with a large enough sample. Using the reddimetric system is to accept that generalizations are more accurate when given an abundance of user data. If we were dealing with five users, I would obviously have to disavow a statistical system to determine value. However, at near 1400 in the database, I feel more confident that advanced reddimetrics gives fairly accurate information on who is both active and can generate karma at a solid rate.
3.) The site from which I am pulling the data is accurate. I noticed quickly that not all of the numbers always lined up perfectly. While I admit this can create doubt, I again argue that with 1400 users, the differences become statistically insignificant. A few users might get lost and undervalued in the shuffle, but on the whole, the data should be largely accurate.
4.) Users are active across all subs equally. This is not true. I know that people tend to spend more time on certain subreddits than others. While this may be the case, I believe that many people signing up for the TDG are more likely than not to be active in a baseball-related subreddit.
Those are the primary assumptions that you have to accept to use advanced reddimetrics. If you can’t get over them, uhh…. too bad? Like, shut up, I did this for fun and I was trying to be helpful. Don’t be a dick.
I guess this brings us to the whole how-to-use section. Here’s what we have:
“Oh wow! Abbreviations and shit. It’s so professional and I don’t know what it means! The guy who did this is super cool and totally not a loser with too much time on his hands,” is probably what you’re thinking.
And you’re so right about the abbreviations. I’ll help you out:
sTPDA: How often a user posts.
cTPDA: How often a user comments.
sKPDA: How much karma a user gets from posts.
cKPDA: How much karma a user gets from comments.
tKPDA: How much karma a user gets in general.
*WARNING: INCOMING NERD SECTION. If you don’t want to read about the statistical stuff I used, then skip to the next asterisk.*
Now, the numbers are obviously a bit weird. And there are some columns I’m hiding. I basically don’t care to go over those, but I’ll tell you the ones that mattered when I was developing the system. I first looked at the date from which the data for each user was taken. I calculated the total number of days, and got a general Karma per day number. However, when I first looked at the data, I had people with 2 posts and comments all the way up to users in the thousands. Same with the total karma per day, but the disparity was even more stark. So I needed a way to make the numbers close enough that they could mean something to people and would be easier to work with. I used logarithms to do so. I was worried that the data would be skewed, and the logs would make them less so. I calculated the logs of each user’s comment and post total, as well as their karma per day values, and subtracted them from a log of the sum of all comment, post, and karma totals. It was really, really fun.
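For anyone who wants to see the log idea in code, here is a rough Python sketch. It is not the actual spreadsheet formula, which is not spelled out above. It assumes each score is the log of a user's per-day value minus the log of the average per-day value across all users, so that zero lands near average. The user names, karma totals, dates, and the function names `per_day` and `log_score` are all made up for illustration, and returning `None` for users with nothing to take a log of is also an assumption.

```python
import math
from datetime import date

def per_day(total, first_day, last_day):
    """Average amount per day over the span the data covers."""
    days = max((last_day - first_day).days, 1)  # guard against a zero-day span
    return total / days

def log_score(value, baseline):
    """log10(value) - log10(baseline); None when a log is undefined."""
    if value <= 0 or baseline <= 0:
        return None  # assumption: users with no activity get no score
    return math.log10(value) - math.log10(baseline)

# Hypothetical data: (total karma, first day covered by the data)
pulled_on = date(2015, 7, 1)
users = {
    "old_power_user": (36500, date(2013, 7, 1)),
    "new_but_busy": (3000, date(2015, 6, 1)),
    "lurker": (0, date(2014, 7, 1)),
}

# Karma per day for each user, regardless of account age
rates = {name: per_day(karma, start, pulled_on)
         for name, (karma, start) in users.items()}

# Assumed baseline: the plain average of everyone's per-day rate
baseline = sum(rates.values()) / len(rates)

# Positive means above the baseline, negative means below it
scores = {name: log_score(rate, baseline) for name, rate in rates.items()}
```

With these made-up numbers, the newer account out-earns the older one per day even though its total karma is much lower, which is the whole point of dividing by days before comparing anyone.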
*NERD SECTION OVER*
I designed the metrics to have positive and negative values with zero being close to average. So, a positive value would be good, and a negative value bad. When I looked at the numbers, for some of the metrics, zero was dead middle. For others, it was skewed one way or the other, because some people just don’t post ever or comment ever or are just really terrible at doing either of those things in spite of how often they do it.
But in general, the way I recommend you use the database is this: Figure out what you want. Do you want people who post more? Who comment more? Sort by that column. You could even unhide some of the columns and filter the users by those who have only made 200 total comments or posts to find active newer users. If you don’t know how to filter or sort, then look up how to do those things on your own. I’m not a fucking high school computer teacher.*
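If you would rather do the sorting and filtering outside of a spreadsheet, here is a minimal Python sketch of the same idea. The column names `cTPDA` and `sTPDA` come from the database. The `total_activity` field stands in for one of the hidden columns and is a hypothetical name, as are the users, the numbers, and the `top_by` function.

```python
# Hypothetical rows; real values would come from the database
rows = [
    {"user": "example_a", "cTPDA": 0.8, "sTPDA": -0.2, "total_activity": 150},
    {"user": "example_b", "cTPDA": 1.4, "sTPDA": 0.5, "total_activity": 4200},
    {"user": "example_c", "cTPDA": -0.3, "sTPDA": 0.9, "total_activity": 90},
]

def top_by(rows, column, max_activity=None):
    """Sort rows by one metric, best first.

    If max_activity is given, keep only users with at most that many
    total comments and posts, which is one way to surface newer accounts.
    """
    if max_activity is not None:
        rows = [r for r in rows if r["total_activity"] <= max_activity]
    return sorted(rows, key=lambda r: r[column], reverse=True)

# Who comments the most?
best_commenters = top_by(rows, "cTPDA")

# Who posts the most among users with 200 or fewer total comments and posts?
active_newcomers = top_by(rows, "sTPDA", max_activity=200)
```

Swap in whichever column matches what you want out of a trade target.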
Anyhow, that’s how to use the advanced reddimetric database. I know a lot of you were worried about where you ranked compared to other redditors. Now, you are free to look, except I don’t know if you should look for yourself, since you might not like what you see. Also, there may be a few users missing, so feel free to let me know who is missing by sending a pm to /u/reddimetrics.
*No disrespect to high school computer teachers. They have to deal with a lot of really annoying students, and they are doing good work. But since I’m not getting paid to be one, I don’t think I should have to act like one.