Identifying Key Metric Drivers in a Cross-Functional Setting
Introduction
Hi, we are Shengying and Maura from Glassdoor. As members of Glassdoor’s decision science team, we help our colleagues make informed decisions using data science. Recently, we received a question from our stakeholders: what are the drivers of key performance indicators such as Net Promoter Score or Value Retention? Questions like these have implications for a broad set of stakeholders, and in these cases project management and communication are as important as the analytical methods themselves for the successful adoption of an analysis across the organization.
Business Question
Net Promoter Score (NPS), a measure of customer experience and loyalty, is a key performance indicator for Glassdoor’s business. Glassdoor regularly surveys all active customers, asking them to rate their experience with Glassdoor products on a scale from 0 to 10; we call this the raw NPS score. Respondents scoring 0–6 are labeled “detractors”, 7–8 “passives”, and 9–10 “promoters”. NPS is then the percentage of users who are promoters less the percentage who are detractors.
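As a quick illustration of that arithmetic, here is a minimal sketch in Python with made-up survey responses:

```python
# Minimal sketch of the NPS calculation described above,
# using made-up 0-10 survey scores for illustration.
scores = [10, 9, 9, 8, 7, 6, 3, 10, 5, 9]

promoters = sum(1 for s in scores if s >= 9)   # 9-10
detractors = sum(1 for s in scores if s <= 6)  # 0-6

# NPS = % promoters - % detractors, reported on a -100 to 100 scale.
nps = 100 * (promoters - detractors) / len(scores)
print(f"NPS: {nps:.0f}")  # 5 promoters, 3 detractors -> 20
```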
Another KPI closely tied to NPS is Value Retention (VR), or the revenue Glassdoor retains, loses, or gains year-over-year. If customers have a good experience, they are more likely to retain and grow their relationship with Glassdoor. Many teams contribute to ensuring a good experience for our employers, from the software engineers building out new features to the Customer Success Managers who help them get the most out of their Glassdoor pages. To improve both our NPS and VR, we must work across teams to identify all potential drivers of these two related metrics, and then size and prioritize potential areas to improve customer experience.
Project Management
Before we write the first line of code, we as data scientists need to identify the methodological approach that best answers our business question: “What drives a customer to be a promoter and grow with Glassdoor?” We anticipated testing a long list of potential drivers, and with an equally long list of stakeholders, the analysis needed to be insightful and actionable for a diverse group of job functions. Building a process that includes all the necessary teams is just as important as building an accurate model.
- Kick-off & Scope: Talk through trends in NPS surveys, value retention reporting, etc. to identify stakeholders and use cases for the results, and finalize the scope (e.g., time period, product SKUs) with the core stakeholder team.
- Collect Hypotheses: Interview the wider stakeholder group to understand which factors they think contribute to NPS or VR, such as employer attributes or trends in the labor market.
- Translate to Data: Break down hypothesized drivers into variables we can test. Build a data dictionary of all training variables that specifies each one’s data source, time range, and exact calculation, so everyone is on the same page and no variable definition is miscommunicated (see the sketch after this list). Data science can also provide valuable input on designing metrics that stakeholders can act on.
- Build the Training Set: Develop the training data set in Hive, pulling in any sources not already in Hive, such as Salesforce reports.
- Variable Selection & Model Training: Test machine learning algorithms and assess model performance.
- Model Review: Present the preliminary results to stakeholders and start grouping predictive variables into domain areas.
- “What-if” Analysis: Finalize the model, then use it to run scenarios that estimate the hypothetical lift in NPS or VR we could see if key drivers were changed.
- Present to Stakeholders: Review the key drivers identified by the model and their potential impact. Work with stakeholders to identify ways to build analysis results into teams’ workstreams.
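To make the “Translate to Data” step concrete, a data dictionary entry might look like the sketch below. The variable names, sources, and calculations are hypothetical, not Glassdoor’s actual definitions.

```python
# Hypothetical data-dictionary entries; table names, time ranges, and
# calculations are illustrative, not Glassdoor's actual definitions.
data_dictionary = {
    "review_count": {
        "source": "Hive: employer_reviews (hypothetical table)",
        "time_range": "trailing 12 months before the survey date",
        "calculation": "count of approved employee reviews",
    },
    "csm_touchpoints": {
        "source": "Salesforce activity report",
        "time_range": "trailing 12 months before the survey date",
        "calculation": "count of logged Customer Success Manager contacts",
    },
}
```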
As we work through this process, we stay in close contact with our stakeholders through email updates, sync meetings, and Slack channels, so we can course-correct as soon as something looks off rather than spend time and energy heading down the wrong path.
Technical Implementation and Challenges
The biggest challenge when building a training set for a modeling task is how best to represent real-world trends with data. For both NPS and VR, we run into the issue of translating an employer’s experience or value into a variable we can predict. To illustrate with NPS: the most obvious option is to use the raw NPS, an ordered numerical score from 0 to 10. This preserves the most information about the score, but modeling an ordered numerical score produces accuracy metrics that are difficult to explain to stakeholders without experience in model interpretation. We prefer to communicate model trade-offs using metrics such as accuracy, precision, and recall, which are fundamentally easier for stakeholders to understand than measures such as root mean squared error (RMSE).
Thus, we transform NPS into one or more binary variables so we can train a classification model instead. We lose some information within each class, but this gives us several options for how to bucket the NPS scores.
For NPS, we ultimately decided on promoters vs. detractors. We repeated this pro/con exercise for VR and similarly bucketed our employers into those that grew their relationship with us versus those that did not.
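A minimal sketch of that bucketing in pandas, assuming passives are excluded so the binary label cleanly separates promoters from detractors (column names are hypothetical):

```python
import pandas as pd

# Hypothetical survey responses; column names are illustrative.
df = pd.DataFrame({"employer_id": [1, 2, 3, 4],
                   "raw_nps": [10, 4, 8, 9]})

# Keep promoters (9-10) and detractors (0-6); this sketch assumes
# passives (7-8) are excluded from the binary target.
df = df[~df["raw_nps"].between(7, 8)].copy()

# Binary label for the classification model.
df["is_promoter"] = (df["raw_nps"] >= 9).astype(int)
```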
Our second challenge was clearly communicating the model’s predictors and their impact on the key metrics. The best-fitting model by overall accuracy and AUC was a random forest, which does not produce easily interpretable coefficients the way a linear or logistic regression does. We can rank the drivers by the variable importance (calculated as entropy reduction) the model produces, but anticipating the size of the impact of increasing or decreasing a driver is more difficult.
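A sketch of how such a model might be trained and its drivers ranked with scikit-learn. The features here are synthetic stand-ins for the real training set, and we pass criterion="entropy" so the importances reflect the entropy-reduction ranking described above:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real training set; feature names are hypothetical.
rng = np.random.default_rng(0)
n = 1000
X = pd.DataFrame({
    "review_count": rng.poisson(50, n),
    "profile_views": rng.poisson(500, n),
    "csm_touchpoints": rng.poisson(5, n),
})
# Fake label loosely tied to review_count so the example has signal.
y = ((X["review_count"] + rng.normal(0, 10, n)) > 50).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# criterion="entropy" so variable importance reflects entropy reduction.
model = RandomForestClassifier(n_estimators=500, criterion="entropy",
                               random_state=0)
model.fit(X_train, y_train)

print("accuracy:", accuracy_score(y_test, model.predict(X_test)))
print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))

# Rank drivers by variable importance.
for name, imp in sorted(zip(X.columns, model.feature_importances_),
                        key=lambda t: -t[1]):
    print(f"{name}: {imp:.3f}")
```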
Our solution is a “what-if” analysis leveraging the trained random forest model. We simulate changes in a single variable at a time over a predetermined range of values based on its historical distribution, and let the model re-score employers at each value. This shows us how changing a single variable shifts the predicted likelihood of being a promoter or of growing the relationship with Glassdoor. Combining variable importance with the what-if results gives us both the relative importance AND the potential impact of each driver. We can also look for the inflection point in the relationship between the input variable and VR, which becomes our “key threshold.”
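A minimal sketch of that loop, reusing the model and feature frame X from the previous snippet: sweep one variable over a grid drawn from its historical distribution, hold everything else at its observed values, and let the model re-score.

```python
import numpy as np

def what_if(model, X, var, quantiles=np.linspace(0.05, 0.95, 10)):
    """Sweep `var` across its historical distribution, holding all other
    variables at their observed values, and return the mean predicted
    promoter likelihood at each swept value."""
    grid = X[var].quantile(quantiles).unique()
    results = []
    for value in grid:
        X_sim = X.copy()
        X_sim[var] = value  # counterfactual: every employer gets this value
        results.append((value, model.predict_proba(X_sim)[:, 1].mean()))
    return results

# E.g., how does predicted promoter likelihood move with review_count?
for value, p in what_if(model, X, "review_count"):
    print(f"review_count={value:.0f} -> mean P(promoter)={p:.3f}")
```

Averaging the re-scored predictions over all employers at each swept value is essentially a partial-dependence estimate, which matches the “holding all other variables equal” framing we use when presenting the results below.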
One thing we keep in mind when relaying results like these to stakeholders is that we, as the data owners, must have an intimate understanding of the patterns within the data and the model. For example, say our “what-if” analysis shows that a 10% increase in employee reviews predicts an increase in promoters. From the distributions of the variables in our training set, we know that employee review count rarely changes in a vacuum, so the chance of moving only that variable is small. Furthermore, from model results such as Shapley values, we know that increasing reviews raises NPS on average, but there will be cases where an employer increases their reviews and their NPS stays flat or even drops. This helps prevent our stakeholders from assuming causal relationships, and we constantly emphasize it: these models are not based on experimental data, so we cannot draw causal conclusions!
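For readers curious what the Shapley-value check might look like, here is a sketch with the shap package against the model and X from the modeling snippet above; the output shape for binary classifiers varies across shap versions, which the sketch accounts for.

```python
import shap

# Explain the trained random forest. Depending on the shap version, the
# result for a binary classifier is either a list of per-class arrays or
# a single 3-D array; either way we take the promoter class.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
promoter_shap = (shap_values[1] if isinstance(shap_values, list)
                 else shap_values[..., 1])

# Mean SHAP value per feature: the average directional effect. A positive
# mean for review_count supports "more reviews -> higher P(promoter)" on
# average, while individual rows can still move the other way.
for name, val in zip(X.columns, promoter_shap.mean(axis=0)):
    print(f"{name}: {val:+.4f}")
```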
Generating Insights with Stakeholders
When presenting the model and “what-if” results to stakeholders, we split the identified drivers into four groups: sales & service experience, product engagement, job seeker engagement, and customer & deal attributes. This gives cross-departmental stakeholders a framework for keeping the results organized, and helps team-specific stakeholders immediately identify what they need to pay attention to most.
We also display results to stakeholders in multiple ways. For each driver, we use two main views: (1) results from the “what-if” analysis, and (2) a simple histogram of the driver’s distribution compared against NPS/VR within each bucket.
The “what-if” view shows the relationship between the driver and NPS/VR holding all other variables equal. The histogram shows the relationship between the driver and NPS/VR without separating the driver from its correlations with the other covariates. Together, the two views show stakeholders both the hypothetical impact of focusing on a driver and how that driver relates to NPS/VR in the real world, giving them a holistic picture of each driver and helping them size and prioritize opportunities to increase NPS/VR.
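A sketch of how the histogram view might be produced with pandas and matplotlib, reusing the synthetic X and y from the modeling snippet (the driver and bucketing choices are illustrative):

```python
import matplotlib.pyplot as plt
import pandas as pd

# Bucket the driver into quintiles and compare promoter rate per bucket.
buckets = pd.qcut(X["review_count"], q=5, duplicates="drop")
promoter_rate = y.groupby(buckets, observed=True).mean()

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.hist(X["review_count"], bins=20)  # the driver's raw distribution
ax1.set(title="Driver distribution", xlabel="review_count")
promoter_rate.plot.bar(ax=ax2)        # outcome within each driver bucket
ax2.set(title="Promoter rate by bucket", ylabel="P(promoter)")
fig.tight_layout()
plt.show()
```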
Conclusion
Analyses like this one, which identify the drivers of business outcomes, demonstrate how data scientists can combine machine learning with good project management to produce results that solve real business problems. The recipe for success:
- Project management sets the foundation for good analysis: focus on process tracking, documentation, and regular check-ins with stakeholders.
- Communicate proactively with stakeholders, and stay empathetic to their business needs and pain points throughout the project.
- Be creative in translating model results into a concise, structured result set, so that stakeholders can easily understand your analysis and build action items from it.
Follow these key steps, and machine learning for insight generation becomes much easier.