Navigating SKAdNetwork: Build Your Game’s Conversion Schema in 4 Easy Steps

Khanh Nguyen
Published in Homa Engineering
7 min read · Apr 10, 2024

Context

As Apple intensifies its efforts to safeguard user privacy, it becomes harder for game developers and publishers to understand the in-game behavior of the users they acquire.

In response, Apple’s own SKAdNetwork (SKAN) attribution service enables publishers to track user activities from their marketing campaigns, albeit with reduced detail to protect user privacy.

Consider the impact on data granularity: instead of reporting an exact user LTV like $0.68, SKAN reports a conversion value, a 6-bit integer from 0 to 63, of which this post uses values 1 to 63. Publishers can then map each conversion value to a specific LTV range*, creating what’s known as the SKAN conversion value schema.

Once the schema is defined:

  • As the user plays the game, if their LTV falls within the range of a specific conversion value, that conversion value is sent to Apple.
  • Once this conversion value is returned from Apple, the publisher can use the predefined schema to convert it back to an LTV, typically the LTV at the middle of the corresponding range.
  • As a result, the LTV for each user is now an estimated value of their true LTV. Hence, the goal is to define a schema so that the estimated LTV is as close as possible to the real LTV of each user.
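
To make this round trip concrete, here is a minimal Python sketch. The three-range schema and the function names are made up for illustration (a real schema would have up to 63 rows), and treating each range's upper bound as inclusive is my own assumption.

```python
from bisect import bisect_left

# Hypothetical schema for illustration: (conversion value, min LTV, max LTV).
SCHEMA = [
    (1, 0.00, 0.50),
    (2, 0.50, 1.00),
    (3, 1.00, 3.00),
]

def ltv_to_conversion_value(ltv: float) -> int:
    """Return the conversion value whose LTV range contains `ltv`."""
    upper_bounds = [max_ltv for _, _, max_ltv in SCHEMA]
    # bisect_left finds the first range whose upper bound is >= ltv.
    # An LTV above every range is clamped to the highest conversion value.
    index = min(bisect_left(upper_bounds, ltv), len(SCHEMA) - 1)
    return SCHEMA[index][0]

def conversion_value_to_ltv(conversion_value: int) -> float:
    """Estimate LTV as the middle of the conversion value's range."""
    for value, min_ltv, max_ltv in SCHEMA:
        if value == conversion_value:
            return (min_ltv + max_ltv) / 2
    raise ValueError(f"unknown conversion value: {conversion_value}")

true_ltv = 0.68
conversion_value = ltv_to_conversion_value(true_ltv)
estimated_ltv = conversion_value_to_ltv(conversion_value)
print(true_ltv, conversion_value, estimated_ltv)
```

With this toy schema, a user with a true LTV of $0.68 lands in the $0.50 to $1.00 range, so the estimate is the middle of that range rather than the true value.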

In the rest of this post, I’ll show how you can create a SKAN conversion schema for your own game. This approach mirrors what we’ve successfully implemented at Homa, where it achieves 98% accuracy in estimating user LTV on iOS.

SKAN schema in image is not representative of the schema used at Homa

* Each conversion value can also be mapped to an in-app event, but mapping conversion value to LTV (typically LTV D0) is still quite common for games

Create SKAN schema from user LTV

To do this, start by obtaining the LTV of each user in your game, from tools like Google Firebase or from your own mobile measurement partner (MMP) such as Adjust or AppsFlyer. For this example, I will randomly generate 100,000 user LTVs from a Pareto distribution, which matches quite well how user LTV tends to be distributed in mobile games.

List of 100,000 randomly-generated user LTV
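
If you want to generate a similar synthetic dataset yourself, the sketch below uses Python's standard library. The shape and scale values are placeholders I picked for illustration, not the parameters behind the figures in this post, so the resulting numbers will differ.

```python
import random

def generate_ltv(n_users: int, shape: float = 2.0, scale: float = 0.5, seed: int = 0) -> list[float]:
    """Draw `n_users` LTVs from a Pareto distribution.

    `shape` and `scale` are illustrative assumptions. random.paretovariate
    returns values >= 1, so every generated LTV is >= `scale`.
    """
    rng = random.Random(seed)  # seeded so the sample is reproducible
    return [scale * rng.paretovariate(shape) for _ in range(n_users)]

ltv_train = generate_ltv(100_000)
print(len(ltv_train), min(ltv_train), max(ltv_train))
```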

Plotting the total revenue generated from each LTV, we see that LTV around $2 contributes the largest revenue.

Distribution of revenue for each LTV

This brings us to the central principle when defining the LTV range for each conversion value:

The LTV range should be narrow around LTV that generates high revenue, so that LTV in this range are estimated as accurately as possible

For this distribution, then, the LTV range should be narrow around $2: say, $1.90 to $2.00, rather than $1.40 to $2.60. Every LTV between $1.90 and $2.00 will then be estimated as $1.95 (the middle of the range), which in turn is close to the true LTV.

A straightforward way to satisfy this principle is by dividing the revenue distribution into 63 equal areas, one for each conversion value. This ensures that LTV that generates high revenue (large height) will have a narrow range (small width), while LTV that generates low revenue (small height) will have a wide range (large width).

Each conversion value = 1.6% of total area under the revenue curve

Mathematically, this means that each conversion value will occupy 100% / 63 ≈ 1.6% of the area under the revenue curve, with:

  • Conversion value 1: 0% to 1.6% of total area (from the left)
  • Conversion value 2: 1.6% to 3.2% of total area, and so on …
  • Conversion value 63: 98.4% to 100% of total area

Showcase SKAN schema creation with code

Implementing this strategy turns out to be quite simple, which I will demonstrate using SQL code for the following steps:

1. Total the revenue generated for each LTV value. For simplicity, each LTV is rounded to 1 decimal place, resulting in 376 unique LTV values and their corresponding revenue. This is far smaller than the list of 100,000 user-level LTVs that we started with, which lets us compute the conversion value schema quickly even when scaled to millions of users.

revenue_by_ltv: Calculate total revenue for each LTV
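
As a rough Python equivalent of this step (the notebook itself uses SQL), the sketch below groups users by LTV rounded to 1 decimal place. Summing the unrounded LTV as the revenue of each group is my assumption, and the toy input is made up.

```python
from collections import defaultdict

def revenue_by_ltv(user_ltv):
    """Return (rounded LTV, total revenue) pairs, sorted by LTV ascending."""
    totals = defaultdict(float)
    for ltv in user_ltv:
        # Group by LTV rounded to 1 decimal place; revenue is the raw LTV.
        totals[round(ltv, 1)] += ltv
    return sorted(totals.items())

toy_revenue = revenue_by_ltv([0.12, 0.14, 0.96, 1.04, 2.0])
for ltv, revenue in toy_revenue:
    print(ltv, round(revenue, 2))
```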

2. Calculate the revenue percentile for each LTV, i.e. what percentage of total revenue is contributed by values at or below the given LTV. This is done by calculating the cumulative revenue up to each LTV and dividing it by the total revenue.

revenue_percentiles: Calculate revenue percentile for each LTV
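
In Python, this step is a running sum divided by the total. The sketch below assumes the input is already sorted by LTV; the toy rows are made up for illustration.

```python
from itertools import accumulate

def revenue_percentiles(revenue_rows):
    """Take (ltv, revenue) pairs sorted by LTV ascending.

    Returns (ltv, percentile) pairs, where percentile is the share of total
    revenue (0 to 100) contributed by values at or below that LTV.
    """
    revenues = [revenue for _, revenue in revenue_rows]
    total = sum(revenues)
    cumulative = accumulate(revenues)  # running total of revenue
    return [(ltv, 100 * c / total) for (ltv, _), c in zip(revenue_rows, cumulative)]

toy_percentiles = revenue_percentiles([(0.1, 1.0), (1.0, 3.0), (2.0, 4.0)])
print(toy_percentiles)
```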

3. Convert revenue percentiles to conversion values by dividing each percentile by 1.6% (100/63). This tells us how many multiples of the “1.6% of total area” the percentile of that LTV spans. Rounded up to a whole number, this is the conversion value the LTV belongs to.

conversion_value_by_ltv: Convert each revenue percentile into conversion value
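
A minimal Python sketch of this conversion is below. Rounding up and clamping to the 1 to 63 range are my assumptions, chosen so that the 0% to 1.6% band maps to conversion value 1 and 100% maps to 63. Multiplying by 63/100 instead of dividing by 1.6 avoids the rounding error in the approximate 1.6% figure.

```python
import math

N_VALUES = 63  # number of conversion values used in the schema

def percentile_to_conversion_value(percentile: float) -> int:
    """Map a revenue percentile (0 to 100) to a conversion value (1 to 63)."""
    raw = math.ceil(percentile * N_VALUES / 100)
    # Clamp so that a percentile of exactly 0 still maps to a valid value.
    return min(max(raw, 1), N_VALUES)

for p in (1.0, 1.6, 50.0, 100.0):
    print(p, percentile_to_conversion_value(p))
```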

4. Find the average LTV for each conversion value. This is done by first finding the minimum and maximum LTV for each conversion value, then taking the average LTV as the midpoint between the two. This average LTV can then be used to convert each conversion value back to its corresponding LTV.

conversion_value_schema: Find maximum, minimum & average LTV for each conversion value
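
The same aggregation can be sketched in Python as follows. The input rows and the output layout are hypothetical and only meant to show the min, max and midpoint calculation.

```python
def conversion_value_schema(rows):
    """Take (ltv, conversion value) pairs.

    Returns {conversion value: (min LTV, max LTV, midpoint LTV)}.
    """
    bounds = {}
    for ltv, value in rows:
        low, high = bounds.get(value, (ltv, ltv))
        bounds[value] = (min(low, ltv), max(high, ltv))
    # The "average" LTV here is the midpoint between min and max.
    return {
        value: (low, high, (low + high) / 2)
        for value, (low, high) in sorted(bounds.items())
    }

toy_schema = conversion_value_schema([(0.1, 1), (0.2, 1), (0.5, 2), (1.0, 2), (2.0, 3)])
for value, (low, high, mid) in toy_schema.items():
    print(value, low, high, mid)
```

Because the LTVs were rounded before grouping, neighbouring ranges can leave small gaps between one range's maximum and the next range's minimum. When applying the schema you need a rule for LTVs that fall in such a gap, for example assigning them to the next higher range.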

Evaluate accuracy of SKAN schema

To evaluate the accuracy of this conversion value schema, we can apply it to a new set of 100,000 user LTV randomly generated from the same Pareto distribution. This is done via the following steps:

  1. Convert each LTV into a corresponding conversion value if the LTV falls within the LTV range of that conversion value. If the LTV is higher than the maximum LTV across all ranges ($94 in this example), it will be converted to the highest conversion value of 63, given that SKAN reports the highest conversion value available for each user.
  2. Convert the conversion value back to the average LTV of that conversion value.
  3. Calculate the difference between the true LTV (from step 1) and the average LTV (from step 2). This is the estimation error of user LTV from the SKAN schema.
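
The three steps above can be sketched in Python as follows. The schema is a made-up three-row example. The two accuracy formulas are my assumptions about one reasonable definition (1 minus the relative absolute error, per user and in total), not necessarily the exact metric behind the figures in this post.

```python
from bisect import bisect_left

# Hypothetical schema rows: (conversion value, min LTV, max LTV, average LTV).
SCHEMA = [
    (1, 0.0, 1.0, 0.5),
    (2, 1.0, 2.0, 1.5),
    (3, 2.0, 4.0, 3.0),
]

def estimate_ltv(ltv: float) -> float:
    """Steps 1 and 2: LTV -> conversion value -> average LTV of that value."""
    upper_bounds = [row[2] for row in SCHEMA]
    # An LTV above every range is clamped to the highest conversion value.
    index = min(bisect_left(upper_bounds, ltv), len(SCHEMA) - 1)
    return SCHEMA[index][3]

def evaluate(true_ltvs):
    """Step 3: compare estimates with true LTV (assumed accuracy definitions)."""
    estimates = [estimate_ltv(ltv) for ltv in true_ltvs]
    total_true = sum(true_ltvs)
    absolute_error = sum(abs(e - t) for e, t in zip(estimates, true_ltvs))
    user_level_accuracy = 1 - absolute_error / total_true
    overall_accuracy = 1 - abs(sum(estimates) - total_true) / total_true
    return user_level_accuracy, overall_accuracy

user_level, overall = evaluate([0.25, 0.75, 1.5, 3.0, 4.5])
print(round(user_level, 3), round(overall, 3))
```

Overall accuracy is never lower than user-level accuracy under these definitions, because over- and under-estimates of individual users partly cancel out when summed.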

Turns out, our schema can estimate the user LTV extremely well: it is around 97% accurate when predicting LTV at user-level, and more than 99% accurate overall!

Evaluate user-level & overall accuracy of schema on new users (100,000 randomly-generated LTV)

At Homa, we found that our own schema is 98% accurate at the campaign and country level, where most UA decisions are made. More surprisingly, it remains accurate even several months after the schema has been rolled out to all our games!

Comparison with existing methods

Surprisingly, a quick Google search of “SKAdNetwork conversion value schema” returns only a handful of articles on setting up your own schema, such as from Google Firebase or Facebook Gaming.

Existing methods use user quantiles rather than revenue quantiles to construct the schema

However, most of these guides construct the schema using user percentiles rather than revenue percentiles. Under these methods, the 1st conversion value contains the 1.6% of users with the lowest LTV, whereas in our approach it contains the lowest-LTV users who together contribute 1.6% of revenue. Because the LTV of that bottom 1.6% of users is very small, they contribute much less than 1.6% of total revenue.

These methods tend to assign overly precise LTV range for low-LTV users & too imprecise LTV range for high-LTV users

This means that LTV ranges from these methods are overly narrow and precise for low-LTV users, while much too wide and imprecise for high-LTV users (see above image). In fact, using this method in our simulated data results in a schema that is only 75% accurate!
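
To see this imbalance on a small scale, the sketch below builds a user-percentile schema in Python and measures how much revenue each conversion value ends up covering. The six LTVs and the use of only 3 conversion values are made up to keep the example readable, and the function names are my own.

```python
import math

def user_percentile_conversion_values(user_ltv, n_values=63):
    """Assign conversion values so that each one holds an equal share of users."""
    n_users = len(user_ltv)
    # Indices of users ordered from lowest to highest LTV.
    order = sorted(range(n_users), key=lambda i: user_ltv[i])
    values = [0] * n_users
    for rank, index in enumerate(order, start=1):
        values[index] = min(math.ceil(rank * n_values / n_users), n_values)
    return values

def revenue_share_by_conversion_value(user_ltv, values):
    """Return {conversion value: fraction of total revenue it covers}."""
    total = sum(user_ltv)
    shares = {}
    for ltv, value in zip(user_ltv, values):
        shares[value] = shares.get(value, 0.0) + ltv / total
    return shares

toy_ltv = [0.1, 0.2, 0.3, 0.4, 1.0, 8.0]
toy_values = user_percentile_conversion_values(toy_ltv, n_values=3)
toy_shares = revenue_share_by_conversion_value(toy_ltv, toy_values)
print(toy_values, {v: round(s, 2) for v, s in toy_shares.items()})
```

Each conversion value holds two of the six users, yet the top one covers about 90% of revenue with a single range spanning $1.0 to $8.0, while the bottom one spends a whole range on about 3% of revenue.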

Potential improvements of the schema

Here are some potential improvements that you could make on the above method when creating the SKAN conversion value schema for your game:

  • Remove users with extremely high LTV that could bias the LTV ranges, especially at the upper end of conversion values.
  • Use more intelligent LTV averages rather than the middle between the minimum and maximum LTV of the range. This could be done simply by finding the average LTV of the users in each conversion value.
  • Monitor how well the schema estimates user LTV over time, and update the schema if it no longer estimates your user LTV well. However, beware of a few days of unstable SKAN data after rolling out a new schema.
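
As an example of the second improvement, here is a minimal Python sketch that replaces the midpoint with the mean LTV of the users in each conversion value. The input rows are made up for illustration.

```python
from collections import defaultdict

def mean_ltv_by_conversion_value(rows):
    """Take (ltv, conversion value) pairs, one per user.

    Returns {conversion value: mean LTV of the users assigned to it}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ltv, value in rows:
        sums[value] += ltv
        counts[value] += 1
    return {value: sums[value] / counts[value] for value in sorted(sums)}

toy_rows = [(0.5, 1), (1.0, 1), (1.0, 2), (1.5, 2), (3.5, 2)]
toy_means = mean_ltv_by_conversion_value(toy_rows)
print(toy_means)
```

In this toy data, conversion value 2 spans $1.0 to $3.5, so its midpoint is $2.25, while the mean of its users is $2.0 because most of them sit near the lower end of the range.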

Create SKAN schema for your own game

You can use the Deepnote notebook shown in this blog post in order to generate the conversion value schema for your own game:

  1. Save your user LTV into a CSV file called ltv_train.csv and upload it to the Deepnote workspace (replace the existing file). This will be used to calculate the SKAN conversion value schema for your game.
  2. (Optional) Save another set of user LTV into a CSV file called ltv_test.csv and upload it to the Deepnote workspace (replace the existing file). This will be used to evaluate the SKAN conversion value schema that you created.
  3. Run the code in the Build conversion value schema from user-level LTV notebook. If you have not completed step 2, you can run only part A of the notebook. Once the schema has been built, download the conversion_value_schema.csv file, which contains the generated conversion value schema of your game.

Conclusion

Wrapping up, this guide simplifies the use of Apple’s SKAdNetwork, making marketing analytics on iOS more accessible for developers and publishers alike.

At Homa, we pride ourselves on pushing the boundaries of technology and data within the gaming industry, and we’re always on the lookout for individuals who share our passion for innovation. If you’re eager to work on solving complex tech problems and contributing to groundbreaking projects, while working with a highly motivated and experienced team, we invite you to take a look at our open roles.

Visit our website to apply and learn more about our exciting tech opportunities.
