Video Game Sales

How hard can it get?

Hazel Tan
Nov 7 · 6 min read

After making fervent Final Fantasy fans wait for 4 years, Square Enix finally confirmed at Tokyo Game Show 2019 that Final Fantasy VII Remake will be released in March 2020. The remake was first announced at E3 2015, and the long wait has made fans impatient and raised expectations of the game even higher; even a ‘commoner’ like me was determined to try it. But why Final Fantasy VII? Other instalments of Final Fantasy have seen re-releases on newer platforms, but some have yet to receive any remakes, notably the first 6 instalments. Maybe Square Enix has criteria for picking which game to remake or re-release.

Concurrently, in a course that I am reading in university, I was tasked with practising data visualisation. I chanced upon a data set on Kaggle that compiled video game sales for over 1900 games, broken down into Japan, US, Europe, Other and Global sales. I noticed immediately that it documented Final Fantasy sales, and making a visualisation out of this data could deepen my understanding of the franchise. On top of using the data given, I plotted the global sales of each Final Fantasy instalment against ratings extracted from Metacritic.

This was the resulting graph:

Fig. 1.1 Chart of the global sales vs ratings of the different Final Fantasy instalments

I also compiled the data of the availability of Final Fantasy Games on various consoles:

Fig. 1.2 Availability of the different Final Fantasy instalments on different consoles

Hmm, looks good so far. A brief look at the graph suggests that sales could be a reason: the first 6 instalments generated much lower sales than the later instalments. However, there were some anomalies, like Final Fantasy VI, which had high ratings but low sales.

Without much consideration, I posted it on Reddit in the r/FinalFantasy community. The next time I checked the post, it had received a lot of criticism from the community. It hurt, but I took a step back to review the constructive criticism.

Age of data

In my eagerness to turn the data into something meaningful, I made a critical mistake: I failed to check when the data was collected or last updated. Upon reviewing the data set, I realised that it had been collected 3 years earlier. Given that the franchise is still ongoing (with Final Fantasy VII Remake set to release in March 2020 and Final Fantasy XVI already in development), the community would expect the data presented to be current. The data set I used was badly outdated, and readers on Reddit were quick to point out its ‘irrelevance’. It was quite obvious, since the latest instalment, Final Fantasy XV, was clearly missing from the chart.
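A quick freshness check before plotting would have caught this. The snippet below is only a minimal sketch in Python: the rows, the field names, the reference date and the two-year threshold are all assumptions made up for illustration, not values taken from the Kaggle data set.

```python
from datetime import date

# Hypothetical rows: placeholder titles and years, NOT the real data set.
rows = [
    {"name": "Game A", "year": 2009},
    {"name": "Game B", "year": 2013},
    {"name": "Game C", "year": 2016},
]

def years_stale(records, today):
    """Return how many years separate the newest record from `today`."""
    newest = max(r["year"] for r in records)
    return today.year - newest

# Assumed reference date, chosen only for this example.
gap = years_stale(rows, date(2019, 11, 1))

# The two-year cut-off is an arbitrary threshold, not a rule.
if gap >= 2:
    print(f"Warning: the newest record is {gap} years old")
```

If the check prints a warning, the chart should at least state the cut-off year of its data, so readers know what is missing.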

Knowing your data

Knowing the data you’re using is an important but not-so-easy-to-execute step of visualising data. As a data user, this is quite difficult because I did not collect the data personally. However, I should have done more research to better understand the numbers I was working with.

Firstly, redditors pointed out the inaccuracy of the data. Statistics such as global sales are hard to verify because current figures are difficult to obtain, and for older games released before 2005, up-to-date sales are especially hard to find. It is also unclear whether the data accounts for inflation. Each game is likely to have varied in price, so the global sales may not be representative of the performance of the game. The total units sold could be used as an indicator instead to reduce such inaccuracies.

Secondly, understanding the field, not just the data set, is crucial for drawing valid insights into the topic. While Final Fantasy has traditionally been released on consoles, Square Enix has also started making their games available on other platforms, such as PC and mobile phones. In particular, the majority of the US market purchasing Final Fantasy VI on mobile platforms instead of consoles was one of the reasons for the drastic difference between its ratings and sales. Square Enix has also re-released some of their instalments on PC and has tried to break into the PC market by launching FFXIV primarily on PC in 2010. The data set and, consequently, the graph left out this paradigm shift in the franchise’s availability on various platforms. This also reinforces the importance of the age of the data set, which does not account for this shift.

Cleaning your data

Final Fantasy is one of the biggest franchises in gaming history, so it was to be expected that classifying the games into suitable categories would be a challenge. Due to the scale of the franchise and of each individual instalment, I decided to total the sales across all the different consoles in the data set. This was to make the graph less cluttered. For example, the graph grouped Final Fantasy X and X-2 together. I grouped Final Fantasy XIII, XIII-2 and Lightning Returns: Final Fantasy XIII similarly.
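The grouping step can be sketched roughly as follows. This is an illustration in Python rather than the exact process I used: the sales figures are placeholders, and the `instalment_of` mapping is a hypothetical helper that says which instalment each title was filed under.

```python
from collections import defaultdict

# Hypothetical rows: (title, platform, global sales).
# The sales figures are placeholders, NOT values from the data set.
rows = [
    ("Final Fantasy X", "PS2", 5.0),
    ("Final Fantasy X-2", "PS2", 3.0),
    ("Final Fantasy XIII", "PS3", 4.0),
    ("Final Fantasy XIII", "X360", 2.0),
]

# Hypothetical mapping from each title to the instalment it is grouped under.
instalment_of = {
    "Final Fantasy X": "Final Fantasy X",
    "Final Fantasy X-2": "Final Fantasy X",
    "Final Fantasy XIII": "Final Fantasy XIII",
}

def total_by_instalment(records, mapping):
    """Sum sales over every platform and every title of an instalment."""
    totals = defaultdict(float)
    for title, _platform, sales in records:
        totals[mapping[title]] += sales
    return dict(totals)

totals = total_by_instalment(rows, instalment_of)
print(totals)
```

Note what the sum throws away: once the platform and the individual title are collapsed into one number, the chart can no longer show which release drove the sales.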

Grouping these games together under the same instalment seemed appropriate, but the differences between the games in each instalment made it an unwise choice. While the games fall under the same instalment, they were not released simultaneously and had different ratings from each other, as illustrated in Fig. 1.3. Furthermore, Square Enix also re-released older games on different platforms. Final Fantasy X and X-2 are a prime example: they were re-released on PS4 as an HD Remastered version, and the overall rating for that release was 85%, lower than the initial ratings. Breaking them into separate games may have been a better option.

Fig. 1.3 Game Ratings of the sub-instalments of Final Fantasy X and XIII

However, this leads to a point of contention. Final Fantasy VI has a total of 11 games nested under the same instalment. Separating the instalment into 11 different games would make the graph cluttered and difficult to read. Additionally, each game generated relatively low global sales, potentially making the graph visually disproportionate. In this situation, breaking up only the bigger and more popular instalments could have been a better approach. The decision should be made on a case-by-case basis. In this case, the Reddit community is more aware of these instalments and would view such a separation as a better representation of the performance of the franchise. In other cases, a better understanding of your audience is necessary to determine how to classify your data.
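One possible way to encode such a case-by-case rule is sketched below in Python. It is only one option among many: the titles, the sales figures, the `instalment_of` mapping and the threshold are all hypothetical, and a real rule would probably also weigh how well the audience knows each game.

```python
# Hypothetical per-title sales; the figures are placeholders.
title_sales = {
    "Final Fantasy X": 5.0,
    "Final Fantasy X-2": 3.0,
    "FF VI port A": 0.5,
    "FF VI port B": 0.25,
}

# Hypothetical mapping from each title to its parent instalment.
instalment_of = {
    "Final Fantasy X": "FF X (all)",
    "Final Fantasy X-2": "FF X (all)",
    "FF VI port A": "FF VI (all)",
    "FF VI port B": "FF VI (all)",
}

def chart_bars(sales, mapping, split_threshold):
    """Keep titles separate for big instalments, merge the small ones."""
    # First pass: combined sales of every instalment.
    combined = {}
    for title, value in sales.items():
        key = mapping[title]
        combined[key] = combined.get(key, 0.0) + value
    # Second pass: choose one bar label per title.
    bars = {}
    for title, value in sales.items():
        key = mapping[title]
        label = title if combined[key] >= split_threshold else key
        bars[label] = bars.get(label, 0.0) + value
    return bars

# The threshold of 5.0 is an arbitrary choice for this example.
bars = chart_bars(title_sales, instalment_of, split_threshold=5.0)
print(bars)
```

With these placeholder numbers, Final Fantasy X and X-2 get their own bars while the two small Final Fantasy VI entries are merged into one.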

Knowing your Audience

This brings me to the point about knowing your audience. The graph was posted on Reddit, specifically in the r/FinalFantasy community. This community is rather niche compared to others like r/Singapore. Members of this community are likely to be well-versed in the franchise and more critical of the information provided to them. This could come in the form of disagreeing with the ratings or offering opinions about the accuracy of the data set. As a non-expert, posting in this community before thoroughly understanding the dynamics of the audience and the Final Fantasy franchise was a mistake. Instead of contributing to the community, the post would just be swept under the carpet: what seemed meaningful to the creator (me) was not perceived similarly by the consumers (the Reddit community).

The niche nature of Final Fantasy reinforces the need to truly understand its breadth and depth. Only then can the graph be potentially insightful and useful to gamers who actively follow this franchise.

Conclusion

Data analysis in the form of data visualisation can only go so far in understanding a topic better. There are undoubtedly benefits: it makes conceptualising correlation, and possibly causation, more intuitive. Using this chart to generate some discussion could have been insightful in showing a simple correlation between ratings and global sales.

However, on the flip side, there are also limitations in relying solely on data visualisation to become an ‘expert’ on a subject. Even assuming the data set is reliable, a chart still misses out on certain dimensions of the data. The multi-dimensional nature of the data made it difficult to compress all of it into a single simple chart. Furthermore, aesthetic choices made to keep the information digestible can limit the depth of the data presented.

Being an ‘expert’ in such a niche area is not just a matter of generating charts from random data sets plucked off the net. With so much data online, approaching it with a firm understanding of the history, potential confounders and current developments is imperative. Being an ‘expert’ means being able to use this knowledge to draw more meaning from the data and to present relationships and causes that have not been explicitly outlined before.
