Utilizing Clustering Techniques to Classify Customer Usage of Tech Platforms

FactSet · Published Feb 5, 2024 · 8 min read

A key aspect of any successful business is understanding your customers. Who are they? What products do they use? Is their engagement increasing or decreasing over time? Identifying accurate metrics to answer these questions can be exceptionally challenging, especially as the variety of users and products grows.

The core challenge is creating an adaptable framework that easily scales to many possible user and product combinations. This is typically the case for large corporations such as financial data providers, which may serve users ranging from lawyers and journalists to wealth managers and quantitative analysts. The tools a user interacts with correlate strongly with the institution they work for and the role they hold.

A logical desire is to create a score representing the overall health of each user, where 0 represents a non-active user and 1 represents a power user. Tracking such a score across clients, regions, and time provides tremendous insight into the overall health of the customer base, identifying areas of growth and areas where increased attention is needed to retain customers. To achieve this, the 0 to 1 range of the score is the same for all users, but each class of user is ranked only against their peers. After all, the products an active lawyer uses are drastically different from those a power-user quantitative analyst at a hedge fund relies on.

In this article, we demonstrate one way you can implement this power score metric in a modular and automated framework that you can easily scale to any subset of clients and products.

Formulation

To track user engagement with a product, first collect the underlying usage data. Typically, usage is represented by the number of interactions performed in an application, or even more simply, the number of clicks or hits. More sophisticated logging frameworks track click patterns and relate them to application feature usage. This is valuable data, but it ultimately reduces to a dataset of <date, user, product, hits>. Depending on the application, you can use hourly usage, which is more granular than daily, and you can divide individual products into <product, feature> pairs. As we show below, you can address these specializations, but to simplify we assume a dataset of <date, user, product, hits>, where users who do not engage with a product on a given day have 0 hits.
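As a rough illustration, this preparation might look like the following sketch in pandas; the column names, products, and values are hypothetical, not an actual logging schema.

```python
import pandas as pd

# Hypothetical raw usage log: one row per <date, user, product> with observed hits.
usage = pd.DataFrame({
    "date":    pd.to_datetime(["2024-01-02", "2024-01-02", "2024-01-03"]),
    "user":    ["u1", "u2", "u1"],
    "product": ["screener", "news", "screener"],
    "hits":    [120, 4, 85],
})

# Densify to the full <date, user, product, hits> grid: every user/product pair
# gets a row for every date in the window, with 0 hits on non-engagement days.
all_dates = pd.DataFrame({"date": pd.date_range(usage["date"].min(),
                                                usage["date"].max(), freq="D")})
pairs = usage[["user", "product"]].drop_duplicates()
grid = (
    pairs.merge(all_dates, how="cross")
         .merge(usage, on=["user", "product", "date"], how="left")
         .fillna({"hits": 0})
)
```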

The exact number of hits doesn’t really matter, and the expected numbers may differ across products and user classes. Comparing one product with 100,000 hits to another with 10 is misleading, because applications have different usage expectations. For example, hits may represent the amount of data pulled from a database or the number of documents read; comparing those raw numbers is meaningless. Comparisons only make sense for the relative usage of two users on the same product. But even here, we may expect a journalist to read more articles than a hedge fund quantitative analyst. Therefore, the final power score only evaluates a user against their peers in the same user class.

Finally, we need to identify any “false hits”, for example pop-up notifications or automated updates, which can distort usage data. We recommend removing this data from the analysis, as it does not result from genuine user interaction.

Given this setup, it is now possible to review each user class and identify the products most critical to that role. We can also identify the kind of usage to expect from a power user. Reviewing each client under this framework, we build our overall view of the business. However, be mindful that this manual procedure only works for small companies: it requires a detailed understanding of the business and is tedious to update as new products are added or user classes are split further.

So how would we take an automation approach to this concept?

Creating an Automated Power Score

The first step towards an automated power score is to create the usage level, a simple score for each <user class, product, time-period> triplet. This score ranges from 0 to 1, where a 1 means that a user consistently engages with the product over a given time period and also engages significantly during each interaction. The product usage score is therefore a scaled combination of a magnitude score and a frequency score.

The magnitude score must accommodate a large range in the number of hits, so we standardize the usage data into deciles based upon product and peer group. As a result, when studying engagement over the course of a month for a given <user class, product> pair, we focus on whether usage is on the higher or lower end of the spectrum; the exact numbers and scales are irrelevant.
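Continuing the sketch above, one way to compute such a magnitude score is a percentile rank mapped onto deciles within each <user class, product, month> peer group; the user-to-class mapping here is made up for illustration.

```python
import numpy as np

# Hypothetical mapping from user to peer group (user class).
grid["user_class"] = grid["user"].map({"u1": "quant_analyst", "u2": "journalist"})
grid["month"] = grid["date"].dt.to_period("M")

# Total hits per <user class, product, month, user>.
monthly = (
    grid.groupby(["user_class", "product", "month", "user"], as_index=False)["hits"]
        .sum()
)

# Percentile rank mapped onto deciles, computed within each peer group and product.
# Ranking (rather than pd.qcut) avoids failures when many users have identical hits.
monthly["magnitude"] = (
    monthly.groupby(["user_class", "product", "month"])["hits"]
           .rank(pct=True)
           .mul(10).apply(np.ceil).div(10)
)
monthly.loc[monthly["hits"] == 0, "magnitude"] = 0.0  # non-active users score 0
```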

The frequency score completes the picture by accounting for the percentage of days a product is used within a calendar month. Alternatively, you can take a rolling 30-day approach, but this requires daily computation, and the upkeep costs grow rapidly for a metric that doesn’t change much; a monthly computation simplifies framework maintenance. Standardize this frequency metric relative to the user’s peer group as well, so that it better represents the user’s relative level of frequency.

To summarize, once the data is prepared, we evaluate user engagement monthly, computing the magnitude and frequency of usage. Over the month, the magnitude tells us whether the user heavily used the product, while the frequency tells us whether the user steadily returned to it. As both scores range from 0 to 1, you can multiply them to obtain a single product usage score that is also scaled from 0 to 1.
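A sketch of the frequency score and the combined product usage score, continuing from the grid and monthly frames above, might look like this.

```python
# Frequency: share of days in the calendar month on which the product was used,
# then percentile-ranked within the peer group.
active = grid[grid["hits"] > 0]
days_used = (
    active.groupby(["user_class", "product", "month", "user"], as_index=False)["date"]
          .nunique()
          .rename(columns={"date": "days_used"})
)
days_used["frequency_raw"] = days_used["days_used"] / days_used["month"].dt.days_in_month
days_used["frequency"] = (
    days_used.groupby(["user_class", "product", "month"])["frequency_raw"]
             .rank(pct=True)
)

# The product usage score is the product of the two 0-to-1 components.
scores = monthly.merge(
    days_used[["user_class", "product", "month", "user", "frequency"]],
    on=["user_class", "product", "month", "user"], how="left",
).fillna({"frequency": 0.0})
scores["usage_score"] = scores["magnitude"] * scores["frequency"]
```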

Also, when we consider product engagement, we use a secondary measure to weight the individual products within a group based upon other attributes such as product price. This properly represents the business context when, for example, a user actively uses premium products while interacting little with others.
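To fold that business context in, the individual product scores can be rolled up into weighted product-group scores; the groups and price-derived weights below are purely illustrative.

```python
# Hypothetical product-to-group mapping and price-derived weights.
scores["group"] = scores["product"].map({"screener": "analytics", "news": "research"})
scores["weight"] = scores["product"].map({"screener": 3.0, "news": 1.0}).fillna(1.0)

# Weighted average usage score per <user class, month, user, product group>.
group_scores = (
    scores.assign(weighted=scores["usage_score"] * scores["weight"])
          .groupby(["user_class", "month", "user", "group"], as_index=False)
          .agg(weighted=("weighted", "sum"), weight=("weight", "sum"))
)
group_scores["group_score"] = group_scores["weighted"] / group_scores["weight"]
```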

For each <user, time-period> we organize the scores of the product groupings into a vector that can be fed into a clustering algorithm, which enables us to classify the user’s relative engagement. However, as the number of products grows, the clustering technique runs into issues. Chief among them: as the dimensionality of the data increases, the space becomes sparse, making it hard to find well-defined groupings. To address this, we embed the vectors using the projection technique UMAP (Uniform Manifold Approximation and Projection). For each point, UMAP identifies the neighboring points as well as the relative distances to a subset of other points in the higher-dimensional space. This information acts as a set of loose constraints, and the location of each point is optimized so that in two dimensions the general distances between points are maintained. UMAP simplifies the relationships between large vectors and projects them into a lower-dimensional space that serves as our new coordinates. This simplification allows the K-means clustering algorithm to identify much tighter and more consistent groupings within the data.
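A minimal sketch of this projection-and-clustering step, assuming the per-user score vectors have already been assembled into a matrix (random placeholder data here), using the umap-learn and scikit-learn packages:

```python
import numpy as np
import umap                      # pip install umap-learn
from sklearn.cluster import KMeans

# Placeholder: 500 users x 40 product-group scores; in practice these are the
# per-user vectors of group scores described above.
rng = np.random.default_rng(0)
vectors = rng.random((500, 40))

# Project the high-dimensional score vectors down to 2 coordinates that roughly
# preserve each point's neighborhood structure.
reducer = umap.UMAP(n_components=2, n_neighbors=15, random_state=42)
embedding = reducer.fit_transform(vectors)

# Cluster users in the embedded space; four clusters mirrors the example discussed below.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=42)
labels = kmeans.fit_predict(embedding)
```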

The resulting clusters allow you to compare constituent members of peer groups and obtain a usage classification. In the heatmap visualization above, we see a clear difference between cluster 3, with no usage, and cluster 0, with power usage. We summarize each cluster with a single vector of usage scores by averaging each product’s score across the cluster’s members. We then compare the clusters’ usage vectors and rank them by magnitude of usage. After clustering, we can classify any subscriber by taking the Euclidean distance between the individual’s usage vector and each cluster’s average vector and assigning the closest cluster. The two-dimensional UMAP projection above is a simplification of the relative distances between users, with the x and y axes representing the algorithm-generated coordinates of each user’s simplified vector. Each data point is colored to indicate the cluster it belongs to, which aligns with the designations on the heatmap. A steady transition appears from cluster to cluster, with clusters 0 and 3 placed at opposite ends.
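Continuing the sketch, the cluster summaries, ranking, and nearest-cluster classification described above could be expressed along these lines:

```python
import numpy as np

# Average usage vector (profile) per cluster, built from the members of each cluster.
cluster_ids = np.unique(labels)
profiles = np.vstack([vectors[labels == c].mean(axis=0) for c in cluster_ids])

# Rank clusters from lightest to heaviest overall usage.
ranking = cluster_ids[profiles.mean(axis=1).argsort()]

def classify(user_vector: np.ndarray) -> int:
    """Assign a user to the cluster with the closest average usage profile."""
    distances = np.linalg.norm(profiles - user_vector, axis=1)
    return int(cluster_ids[distances.argmin()])

new_user = np.random.default_rng(1).random(vectors.shape[1])
print(classify(new_user), ranking)
```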

To maintain this system, recalculate the classifications over the observation time window on a regular schedule. Beyond the scope of this post, be aware that peer groups can change over time as classification definitions change or certain groups become a smaller part of your business. Peer groups must have enough data points to produce reliable results, so as peer group populations drift, reclassify during regular maintenance. It’s also important to create a monitoring system that evaluates how the model’s performance changes over time. Accomplish this by creating a breakdown of descriptive statistics on the model classifications: if the usage for top-tier classifications is not clearly separated from the usage of the lowest tier, the model may need re-evaluation. Parameterized guard rails that catch and reassign outlier classifications offer one protection against model drift. Additional feedback from business stakeholders is essential to better mold the model to the needs of your company. Optimizing the classifications requires an iterative process of result validation, critique, and re-evaluation.
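One simple drift check along these lines is a per-cluster breakdown of descriptive statistics that flags periods when the top and bottom tiers stop being clearly separated; the overlap rule below is a sketch, not a prescribed threshold.

```python
import pandas as pd

# Descriptive statistics of overall usage per assigned cluster.
overall = pd.DataFrame({"cluster": labels, "mean_score": vectors.mean(axis=1)})
stats = overall.groupby("cluster")["mean_score"].describe()

# Flag overlap between the heaviest- and lightest-usage tiers.
top, bottom = stats["mean"].idxmax(), stats["mean"].idxmin()
if stats.loc[top, "25%"] <= stats.loc[bottom, "75%"]:
    print("Top and bottom tiers overlap; consider re-evaluating the model.")
```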

The benefit of this approach is that it provides a simple way to represent and categorize users on a spectrum from non-engaged to power users. Additionally, the generated heatmaps provide a glimpse into how users transition from one category to the next and which products are most important within each category, insight you can leverage to make better recommendations to sales teams. Furthermore, as more users and data become available, you can expand this methodology to treat US and European clients separately while retaining a systematic way to compare the two regions. Finally, this approach helps monitor client engagement through time, identifying periods of increased and decreased engagement.

Author: Jack Deegan (Data Scientist)

Contributor: Yuri Malitsky (Senior Director, Engineering)

Reviewers: Nick Waleszonia (Director, Process Engineering) & Josh Gaddy (VP, Director, Developer Advocacy)
