How we’re bringing a holistic data experience to our customers and products

Unity in Scotland · Published in Unity Life · May 30, 2022

My name is Keir, and I am a Senior Software Engineer at Unity.

A little while ago I wrote an intro for Team Mesa, whose focus was Player Engagement tooling: using a microservice architecture to build a new Push Notification System for the Unity Gaming Services (UGS) suite. Since then, part of the Mesa team has joined forces with some of our engineers based in the US and Canada to create Team Rocket. In this blog we’ll cover our new focus and what we’re working on.

A common request, both internally at Unity and externally from our customers, is a single view of combined data from all of the UGS products. With combined data and metrics, customers can derive new insights into their game across Unity products, such as total revenue across Analytics IAPs and Ad Revenue. Unity can also use that data to create predictions and insights, e.g. suggesting a campaign to improve player retention rates. None of our competitors currently offer a holistic view of data across their game-related services, so addressing this request would give developers and live-ops managers new insights, allowing them to make better decisions for their games!

My team has been tasked with driving a solution to this, and we’re tackling it in two separate pieces.

Data Gateway API

The first part is creating a consistent and simple way for all teams within Unity to fetch data from any underlying data source. The data needs to be easily transformed into a consistent format to enable our front-end engineers to display their data within the Unity Dashboard with minimal engineering effort.

To achieve this, we’re building out the Data Gateway with pluggable downstream services called Data Interfaces, which return data to the Data Gateway adhering to an OpenAPI 3 spec.

This allows the Data Gateway to return that data to the Front End irrespective of the downstream data source type or client implementation specifics (e.g. JDBC vs. a client library such as Bigtable or BigQuery).

The Data Gateway is responsible for formatting downstream data to match Chart definitions created by engineers and managers. These Charts are made up of multiple Plot objects, and each Plot corresponds to a Query object. The Data Gateway also holds configuration that determines which Data Interface to forward an incoming API request to. Once the data comes back from the Data Interface, the Gateway transforms it to fit the Chart definition.

The Chart object is flexible, in that the data returned is not directly tied to how the Front End visually displays it, but the response does contain Headers that the Front End can use to simplify rendering. These Headers hint at some visualisation aspects, such as axis names, plot names, and the data type for the axes.
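To make this a bit more concrete, here’s a minimal sketch of what a Chart definition and its rendering hints could look like. The field names and shapes below are illustrative assumptions, not the actual UGS API:

```python
from dataclasses import dataclass, field
from typing import Any

# Hypothetical shapes -- field names are illustrative, not the real UGS API.

@dataclass
class Query:
    """Identifies a query owned by a Data Interface; the Gateway only knows its id."""
    query_id: str

@dataclass
class Plot:
    """One series in a chart, backed by exactly one Query."""
    name: str
    query: Query

@dataclass
class Header:
    """Rendering hints the Front End can use: axis names and data types."""
    x_axis: str
    y_axis: str
    x_type: str = "datetime"
    y_type: str = "number"

@dataclass
class Chart:
    """A chart definition: multiple plots, plus headers that hint at visualisation."""
    title: str
    plots: list[Plot]
    headers: Header
    data: dict[str, list[Any]] = field(default_factory=dict)  # filled in by the Gateway

# Example: a combined revenue chart built from two separate queries.
revenue_chart = Chart(
    title="Combined revenue",
    plots=[
        Plot(name="IAP revenue", query=Query(query_id="analytics-iap-revenue")),
        Plot(name="Ad revenue", query=Query(query_id="ads-revenue")),
    ],
    headers=Header(x_axis="Date", y_axis="Revenue (USD)"),
)
```

Because a Chart’s plots can reference queries owned by different Data Interfaces, the same definition can combine data from more than one source.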

It’s worth noting that the Data Gateway doesn’t know anything about the underlying query that a Data Interface runs; it’s only aware of the queryId and the URL of the Data Interface, while the Data Interface knows about the Query. This keeps the domains of responsibility clean, meaning queries can be changed and modified without needing any changes to the Data Gateway.
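A rough sketch of how that routing could work, assuming a simple mapping from queryId to Data Interface URL (the URLs and names here are hypothetical):

```python
import requests  # assumes the Gateway talks to Data Interfaces over HTTP

# Hypothetical routing config: the Gateway only knows which Data Interface
# owns each queryId, never the query itself.
QUERY_ROUTES = {
    "analytics-iap-revenue": "https://analytics-interface.internal/v1/query",
    "ads-revenue": "https://ads-interface.internal/v1/query",
}

def fetch_query(query_id: str, params: dict) -> dict:
    """Forward an incoming request to the Data Interface that owns the query."""
    url = QUERY_ROUTES[query_id]
    response = requests.post(url, json={"queryId": query_id, "params": params}, timeout=30)
    response.raise_for_status()
    return response.json()
```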

Another advantage of this separation is that we are able to fetch data from different Data Interfaces, and display that data in a single chart/visualisation, meaning you can see new insights and correlations of your data.

Data Interface API

Teams can independently integrate with our Data Gateway, as long as the data they return matches the OpenAPI 3 definition that the Data Gateway expects. Our API spec between the Data Gateway and the Data Interfaces must be flexible enough that:

  1. Engineers aren’t confined to response structures that don’t fit their data well
  2. The Data Gateway can still interpret and transform every response into a uniform format that the frontend client can easily interact with, as sketched below
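Here’s a minimal sketch of the sort of Data Interface response the Gateway could expect. The exact OpenAPI definition is internal, so treat these field names as assumptions:

```python
# Hypothetical Data Interface response body -- columns plus rows, so the
# Gateway can reshape it into any Chart without knowing the underlying query.
data_interface_response = {
    "queryId": "analytics-iap-revenue",
    "columns": [
        {"name": "event_date", "type": "date"},
        {"name": "revenue_usd", "type": "number"},
    ],
    "rows": [
        ["2022-05-01", 1523.40],
        ["2022-05-02", 1608.75],
    ],
}
```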

By separating the responsibilities between the Data Gateway and the Data Interfaces, the Data Interface concept empowers teams to attach and fetch from any underlying data source, whether it’s SQL based, NoSQL based, flat storage based, and so on.

The Data Gateway, meanwhile, gives the Front End a single, simple way to ask for that data and display it consistently within the Unity Dashboard, regardless of the underlying data interactions.

Unifying the Data

The other half of the problem is unifying data within Unity, to allow us to start joining data between different data stores and data sets. To achieve this, we are identifying “joining points” between the data across different data domains, and then building out queries that represent a unified metric.

For unifying data, we’re making use of Snowflake and the Snowflake engine. Snowflake is used widely across Unity; however, not all data lives there, and part of the problem we’re helping to solve is getting teams’ data into Snowflake to enable these powerful, cross-domain queries.

Snowflake offers some great out-of-the-box solutions for ingesting data from various sources. One of my team’s most used ingestion approaches is reading line-delimited JSON data from a GCS bucket via a Snowflake Stream, deconstructing the JSON into a stage area, and then mapping from that stage area into a columnar table. Snowflake is really performant with this approach, allowing us to load hundreds of thousands of events per second.
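As an illustration of that pattern, here’s a rough sketch of the load step using an external stage over GCS and a COPY INTO, driven from the Snowflake Python connector. The account, integration, stage, and table names are made up, and the real pipeline (staging area, stream, and mapping) is more involved than this:

```python
import snowflake.connector  # assumes the snowflake-connector-python package

# Illustrative only: account, integration, stage, and table names are hypothetical.
conn = snowflake.connector.connect(
    account="my_account", user="loader", password="...", warehouse="LOAD_WH",
)
cur = conn.cursor()

# External stage over the GCS bucket holding line-delimited JSON events.
cur.execute("""
    CREATE STAGE IF NOT EXISTS raw_events_stage
      URL = 'gcs://my-events-bucket/events/'
      STORAGE_INTEGRATION = gcs_events_integration
      FILE_FORMAT = (TYPE = JSON)
""")

# Land each JSON line as a single VARIANT row in a staging table...
cur.execute("CREATE TABLE IF NOT EXISTS raw_events (payload VARIANT)")
cur.execute("COPY INTO raw_events FROM @raw_events_stage")

# ...then map the semi-structured payload into a columnar table
# (assumes an `events` table with matching columns already exists).
cur.execute("""
    INSERT INTO events (event_time, event_name, player_id, revenue_usd)
    SELECT payload:eventTime::TIMESTAMP_NTZ,
           payload:eventName::STRING,
           payload:playerId::STRING,
           payload:revenueUsd::NUMBER(12,2)
    FROM raw_events
""")
```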

The current approach is for teams and organisations within Unity to ingest their data into their own Snowflake accounts, so that they remain in control of their data domain and we do not interact in any way with the data they use for business-as-usual operations. All of our interaction is via Data Shares within Snowflake.

The Unified Store describes a “Data Schema” that the other data domains must adhere to so that we can build out Views within our Unified domain. Once the data is available within the Unified domain via the Data Shares, we can build out complex queries across many shares and generate a simple View on top of all of them. We then run queries via the Data Gateway and a Data Interface to fetch the Unified Metrics and render the data back in the Unity Dashboard.
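For example, a unified revenue metric could be a view that joins tables exposed by two shared databases on their “joining points”. Everything below (share, schema, and column names) is hypothetical, but it shows the shape of the approach; it would be executed via the same connector pattern as the ingestion sketch above:

```python
# Hypothetical view over two Data Shares, joined on shared "joining points"
# (here: game id and event date). Share/table/column names are made up.
UNIFIED_REVENUE_VIEW = """
CREATE OR REPLACE VIEW unified.metrics.combined_revenue AS
SELECT a.game_id,
       a.event_date,
       a.iap_revenue_usd + m.ad_revenue_usd AS total_revenue_usd
FROM analytics_share.public.daily_iap_revenue AS a
JOIN monetization_share.public.daily_ad_revenue AS m
  ON a.game_id = m.game_id
 AND a.event_date = m.event_date
"""
```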

However, once the data is available in Snowflake over the Data Shares, there are more puzzles to solve, such as reconciling levels of aggregation (hourly, daily, weekly…) between data sets that operate at different granularities. We have to consider normalisation of the data between the data sets so that the final value is accurate and correct. Lastly, we need to make sure that all of these queries are performant, and that they don’t take an eternity to run when the Front End clients ask for their Metrics.
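As a small illustration of the aggregation issue, a data set shared at hourly granularity might need rolling up to daily before it can be joined with a daily data set; again, the names here are made up:

```python
# Hypothetical: roll an hourly share up to daily so it lines up with daily data sets.
HOURLY_TO_DAILY = """
SELECT game_id,
       DATE_TRUNC('day', event_hour) AS event_date,
       SUM(ad_revenue_usd)           AS ad_revenue_usd
FROM monetization_share.public.hourly_ad_revenue
GROUP BY game_id, DATE_TRUNC('day', event_hour)
"""
```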

Finally

I’m really excited to see our Unified Metrics land in the Unity Dashboard soon, helping developers and managers get a new, first-of-its-kind, holistic overview of all their data across Unity Gaming Services products!

On the Data Gateway side, I’m also really excited to get internal teams onboarded into the Data Gateway, so we can start simplifying their engineering efforts to render their metrics and data in uDash.

As the Data Gateway & Data Interfaces grow to support many different data sources, and we enable self-serve onboarding with different teams within Unity, I’m sure we’ll face some new challenges around observability of downstream services and automatically handling increases in request scale. I’m looking forward to tackling those challenges and continuing to grow the Data Gateway adoption within Unity.

Members of Team Rocket, Caithan, Daniel, Keir (me), Da and Ada!
