Being a Data Analyst at a Game Developer
Contribute to give your team a boost
Data are just summaries of thousands of stories — tell a few of those stories to help make the data meaningful.
Analysis of data is a vital part of running a successful business. When data is used effectively, it leads to better understanding of a business’s previous performance and better decision-making for its future activities. There are many ways that data can be utilized, at all levels of a company’s operations.
The gaming industry was successful even before game analytics was introduced. With the rise of mobile games, however, the amount of data that can be gathered has increased tremendously compared with the console games that dominated the past.
As with any other app or online source, you can gather data from the actions a user takes, or even does not take, either directly from your app or indirectly from other sources. The next step is to use this data for analysis, and to understand what you can contribute as a data analyst.
The truth is, not every stakeholder has enough data literacy or understands the potential of the data you have. As the gatekeeper of the data, you need to be able to guide them even before you touch the data.
1) Define your questions and goals
Please, do not follow your stakeholders’ requests blindly!
In data analysis, you must begin with the right question(s). Questions should be measurable, clear and concise. Design your questions to either qualify or disqualify potential solutions to your specific problem or opportunity. In the gaming industry, a data analyst typically works with Product Managers, Game Designers, or Marketing, each of whom has certain goals in sight. Ask them, “What is your objective in knowing this?”
For example, start with a clearly defined problem: a game is experiencing a declining number of paying users. One of many questions to ask about this problem might be: how can we increase the conversion rate without compromising revenue? Then again, a product manager may have their own strategy or objective: how can we acquire more new users? Game designers may have theirs as well: what kind of in-app purchase is more appealing? However, in a fast-paced setting you almost certainly do not have time to pursue every question, and a misguided analysis would be a waste of resources.
You should have a basic understanding of the data you monitor and of how the metrics interact with each other. The first thing you can do to give your team a boost is to break the problem down into its possible root causes. Then you can lay out different scenarios, showing which plan has the greatest impact, which takes the heaviest effort, and which is practically impossible given the current state of data availability.
For the example above, the first thing you could do as a data analyst is check the validity of the data. Is the number of paying users really declining, or is it just an error in the data? If it is an error, you have just avoided a whole load of misguided analysis. If the decline is real, you could run a general check-up of the game. Is there any technical issue? Are there any changes in the game, possibly from an update, that might make the UX different? If there are none, you can finally start looking into other metrics to find where the bottleneck is. After that, you can tell your team where you think a more thorough analysis is needed.
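A validity check like this can be very small. The sketch below is only an illustration: the row layout `(day, user_id, transaction_id)`, the function names, and the sample rows are all made up for this example. It counts distinct paying users per day while skipping duplicated transactions, and lists days with no purchase rows at all, which often point to a tracking or pipeline gap rather than a real decline.

```python
from collections import defaultdict
from datetime import date, timedelta


def daily_paying_users(purchases):
    """Count distinct paying users per day from (day, user_id, transaction_id) rows."""
    seen_transactions = set()
    payers = defaultdict(set)
    for day, user_id, transaction_id in purchases:
        if transaction_id in seen_transactions:
            continue  # duplicated row, e.g. from a retried export
        seen_transactions.add(transaction_id)
        payers[day].add(user_id)
    return {day: len(users) for day, users in payers.items()}


def missing_days(counts, start, end):
    """Return every day in [start, end] that has no purchase rows at all."""
    gaps = []
    current = start
    while current <= end:
        if current not in counts:
            gaps.append(current)
        current += timedelta(days=1)
    return gaps


# Made-up sample rows, for illustration only.
purchases = [
    (date(2021, 2, 1), "u1", "t1"),
    (date(2021, 2, 1), "u2", "t2"),
    (date(2021, 2, 1), "u2", "t2"),  # duplicate of the previous row
    (date(2021, 2, 3), "u1", "t3"),
]
counts = daily_paying_users(purchases)
gaps = missing_days(counts, date(2021, 2, 1), date(2021, 2, 3))
print(counts)
print(gaps)
```

A day that shows up in `gaps` is worth raising with the engineering team before anyone starts explaining the "decline".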
2) Set Clear Measurement Priorities
This step breaks down into two sub-steps:
A) Decide What To Measure
Using the declining paying user example, consider what kind of data you would need to answer your key question. In this case, you probably need to know the number of daily active users, the number of paying users, retention, the percentage of time users spend in the store, which items are more likely to be purchased, and how engaged the users are. In answering this question, you will likely need to answer many sub-questions (e.g., Is the game so unbalanced that users are already overpowered without premium items? If so, what improvements would help?). Finally, in your decision on what to measure, be sure to include any reasonable objections stakeholders might have (e.g., If the game is made harder, how would the majority of users respond to the surge in difficulty?).
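To make a few of these metrics concrete, here is a minimal sketch that derives daily active users, paying users, conversion rate, and revenue from raw events. The event layouts, the function name `daily_metrics`, and the sample values are assumptions for illustration, not a real tracking schema.

```python
def daily_metrics(sessions, purchases):
    """Compute per-day metrics.

    sessions:  iterable of (day, user_id)
    purchases: iterable of (day, user_id, amount)
    """
    active = {}
    for day, user_id in sessions:
        active.setdefault(day, set()).add(user_id)

    payers = {}
    revenue = {}
    for day, user_id, amount in purchases:
        payers.setdefault(day, set()).add(user_id)
        revenue[day] = revenue.get(day, 0.0) + amount

    metrics = {}
    for day, users in sorted(active.items()):
        dau = len(users)
        paying = len(payers.get(day, set()))
        metrics[day] = {
            "dau": dau,
            "paying_users": paying,
            "conversion_rate": paying / dau,  # share of active users who paid
            "revenue": revenue.get(day, 0.0),
        }
    return metrics


# Made-up sample events, for illustration only.
sessions = [
    ("day1", "a"), ("day1", "b"), ("day1", "c"), ("day1", "d"),
    ("day2", "a"), ("day2", "b"),
]
purchases = [("day1", "a", 4.99), ("day2", "a", 0.99), ("day2", "b", 9.99)]
metrics = daily_metrics(sessions, purchases)
for day, values in metrics.items():
    print(day, values)
```

Looking at these numbers side by side matters: a rising conversion rate on a shrinking user base can still mean fewer paying users.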
B) Decide How To Measure It
Thinking about how you measure your data is just as important, especially before the data collection phase, because your measuring process either backs up or discredits your analysis later on. Key questions to ask for this step include:
- What is your time frame? (e.g., annual versus quarterly)
- What is your unit of measure? (e.g., USD versus EUR)
- What other factors should be included? (e.g., first item users bought that made them repeat buyers)
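The first two questions above can be pinned down in code before any analysis starts. The sketch below reports revenue per quarter in a single currency. The exchange rate is a made-up constant for illustration, and `quarter_of` and `revenue_by_quarter` are hypothetical helper names.

```python
from datetime import date

# Hypothetical fixed conversion rates to USD, for illustration only.
RATES_TO_USD = {"USD": 1.0, "EUR": 1.25}


def quarter_of(day):
    """Map a date to a label such as '2021-Q1'."""
    return f"{day.year}-Q{(day.month - 1) // 3 + 1}"


def revenue_by_quarter(purchases, rates=RATES_TO_USD):
    """Sum (day, amount, currency) rows per quarter, converted to USD."""
    totals = {}
    for day, amount, currency in purchases:
        key = quarter_of(day)
        totals[key] = totals.get(key, 0.0) + amount * rates[currency]
    return totals


# Made-up sample rows, for illustration only.
purchases = [
    (date(2021, 1, 15), 10.0, "USD"),
    (date(2021, 3, 31), 5.0, "EUR"),
    (date(2021, 4, 1), 2.0, "USD"),
]
totals = revenue_by_quarter(purchases)
print(totals)
```

Agreeing on the time frame and unit in one place like this keeps two analysts from reporting different numbers for the same question.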
3) Collect Data
With your question clearly defined and your measurement priorities set, now it’s time to collect your data. As you collect and organize your data, remember to keep these important points in mind:
- Before you collect new data, determine what information could be collected from existing databases or sources on hand. Collect this data first.
- Determine a file storing and naming system ahead of time to help all tasked team members collaborate. This process saves time and prevents team members from collecting the same information twice.
- If you need to gather additional data, then develop a template ahead of time to ensure consistency and save time.
- Keep your collected data organized and add any source notes as you go (including any data normalization performed). This practice validates your conclusions down the road.
4) Data Cleansing
Not all of the data you collect will be useful or relevant to the aim of your analysis, so it should be cleansed. The collected data may contain duplicate records, stray white space, or errors, and it should be cleaned until it is free of them. This must be done before analysis, because the output of your analysis is only as reliable as the data that goes into it.
A solid data cleansing approach should satisfy a number of requirements:
- Detection and removal of all major errors and inconsistencies in data, whether dealing with a single source or integrating multiple sources.
- Correction of mismatches, ensuring that columns are in the same order and that the data is in the same format (such as dates and currency).
- Enriching or improving data by merging in additional information (such as adding data to asset details by combining data from Purchasing, Sales and Marketing databases) if required.
- Data cleaning should not be performed in isolation but together with schema-related data transformations based on comprehensive metadata.
- Mapping functions for data cleaning should be specified in a declarative way and be reusable for other data sources as well as for query processing.
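As a rough illustration of a declarative, reusable mapping, the sketch below keeps the cleaning rules in a single dictionary (`CLEANERS`, a hypothetical name) that maps each column to a cleaning function. The column names, the accepted date formats, and the rule that two rows with identical cleaned values are duplicates are all assumptions for this example. In a real pipeline you would rather deduplicate on a transaction id.

```python
from datetime import datetime


def parse_date(value):
    """Accept two common date formats and return ISO 8601 (YYYY-MM-DD)."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {value!r}")


# Declarative mapping: column name -> cleaning function.
CLEANERS = {
    "user_id": str.strip,
    "purchase_date": lambda v: parse_date(v.strip()),
    "amount": lambda v: round(float(v), 2),
}


def clean_rows(rows, cleaners=CLEANERS):
    """Return (cleaned, rejected): cleaned rows are normalized and deduplicated."""
    cleaned, rejected, seen = [], [], set()
    for row in rows:
        try:
            record = {col: fn(row[col]) for col, fn in cleaners.items()}
        except (KeyError, ValueError):
            rejected.append(row)  # keep bad rows for inspection, do not drop silently
            continue
        key = tuple(record[col] for col in cleaners)
        if key in seen:
            continue  # exact duplicate after normalization
        seen.add(key)
        cleaned.append(record)
    return cleaned, rejected


# Made-up sample rows, for illustration only.
rows = [
    {"user_id": " u1 ", "purchase_date": "2021-02-01", "amount": "4.99"},
    {"user_id": "u1", "purchase_date": "01/02/2021", "amount": "4.99"},  # same purchase, other format
    {"user_id": "u2", "purchase_date": "Feb 2", "amount": "0.99"},  # unparseable date
]
cleaned, rejected = clean_rows(rows)
print(cleaned)
print(rejected)
```

Because the rules live in one dictionary, a second data source only needs its own mapping, not its own cleaning loop.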
5) Analyze Data
After you’ve collected the right data to answer your question from Step 1, it’s time for deeper data analysis. Begin by manipulating your data in a number of different ways, such as plotting it and looking for correlations, or sorting and filtering it by different variables, cohorts, or segments.
Most of the time, cohort analysis is really helpful for checking the performance of a game. We do this by breaking the users down into smaller groups, based on install date, demographics, or any shared behavior. In Fig. 1, we can pinpoint an anomaly in the Feb 2 cohort, which has a significantly lower D1 retention.
Sometimes we also need to break the data down by more than one dimension. In Fig. 2, I group users by their spending into free, minnow, dolphin, and whale, and also by the number of days before they uninstall. From that, we can see that a large number of free users uninstall the game on Day 0.
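For readers who have not computed it before, D1 retention per install cohort could be sketched as follows. The input layouts, the function name `d1_retention`, and the sample data are made up for illustration and are not the data behind the figure.

```python
from datetime import date, timedelta


def d1_retention(installs, sessions):
    """Share of each install-date cohort that was active the day after install.

    installs: {user_id: install_date}
    sessions: iterable of (user_id, day)
    """
    active = set(sessions)
    cohorts = {}  # install_date -> (cohort size, users retained on day 1)
    for user_id, installed in installs.items():
        size, retained = cohorts.get(installed, (0, 0))
        came_back = (user_id, installed + timedelta(days=1)) in active
        cohorts[installed] = (size + 1, retained + int(came_back))
    return {day: retained / size for day, (size, retained) in cohorts.items()}


# Made-up sample data, for illustration only.
installs = {
    "u1": date(2021, 2, 1),
    "u2": date(2021, 2, 1),
    "u3": date(2021, 2, 2),
    "u4": date(2021, 2, 2),
}
sessions = [
    ("u1", date(2021, 2, 2)),
    ("u2", date(2021, 2, 2)),
    ("u3", date(2021, 2, 3)),
]
retention = d1_retention(installs, sessions)
for install_day, rate in retention.items():
    print(install_day, rate)
```

A cohort whose rate sits well below its neighbours is the kind of anomaly worth tracing back to a release, a campaign, or a tracking issue on that date.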
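A two-way breakdown like this is essentially a count per pair of labels. In the sketch below, the spending thresholds for each tier are hypothetical (the tiers are named in this article but their cut-offs are not), and the sample users are made up.

```python
from collections import Counter


def spend_tier(total_spend):
    """Label a user by lifetime spend. Thresholds are hypothetical."""
    if total_spend <= 0:
        return "free"
    if total_spend < 10:
        return "minnow"
    if total_spend < 100:
        return "dolphin"
    return "whale"


def uninstall_crosstab(users):
    """Count users per (spend tier, days before uninstall) pair.

    users: iterable of (total_spend, days_before_uninstall)
    """
    return Counter((spend_tier(spend), days) for spend, days in users)


# Made-up sample users, for illustration only.
users = [(0, 0), (0, 0), (0, 3), (4.99, 7), (25.0, 30), (250.0, 90)]
table = uninstall_crosstab(users)
for (tier, days), count in sorted(table.items()):
    print(tier, days, count)
```

Once the counts are in this shape, plotting them as a heatmap or stacked bars is a small step.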
As you manipulate data, you may find that you have the exact data you need, but it is also likely that you have to revise your original question or collect more data. Either way, this initial analysis of trends, correlations, variations and outliers helps you focus your data analysis on better answering your question and any objections others might have.
6) Interpret Results
After analyzing your data and possibly conducting further research, it’s finally time to interpret your results. As you interpret your analysis, keep in mind that you cannot ever prove a hypothesis true; rather, you can only fail to reject it. This means that no matter how much data you collect, chance could always interfere with your results.
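One common way to put a number on "chance" when comparing two conversion rates is a two-proportion z-test. The sketch below is one possible approach, not the only one, and the sample counts are made up. A small p-value lets you reject the hypothesis that both groups convert at the same rate, while a large one only means you failed to reject it.

```python
from math import erf, sqrt


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for the difference between two proportions."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    # Pooled proportion under the null hypothesis that both rates are equal.
    pooled = (success_a + success_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Two-sided p-value from the standard normal CDF.
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - normal_cdf)
    return z, p_value


# Made-up counts: payers out of active users, before and after an update.
z, p_value = two_proportion_z_test(50, 1000, 30, 1000)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

The test assumes reasonably large, independent samples, so treat the result as one piece of evidence rather than a verdict.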
As you interpret the results of your data, ask yourself these key questions:
- Does the data answer your original question? How?
- Does the data help you defend against any objections? How?
- Are there any limitations on your conclusions, or any angles you haven’t considered?
If your interpretation of the data holds up under all of these questions and considerations, then you likely have come to a productive conclusion. The only remaining step is to use the results of your data analysis process to decide your best course of action.
Conclusion
It is tempting to think that once you have a large amount of data, you only need to train yourself to start seeing patterns. Game analytics, or in fact any analytics, should work the other way around. Start by exploring your current situation in acquisition, engagement and monetization, and then build up your plan of analysis. Any analysis you do must be tied to your business goals, and must support or trigger an action on your games.
Feel free to leave a comment below. The intention of this piece is mostly to share lessons learned and ideas from working in the gaming industry. However, some of them might come in handy for other industries, too!
If this article was useful to you, please share it with your colleagues and friends. Have fun with data!