Why your company needs standardized metrics to accelerate development of successful features

Aaron Powers
athenahealth design
Oct 24, 2018

by Aaron Powers, Sr. Manager of Experience Measurement, Allison LaValley, Director of Product Operations, & David Drollette, VP of Product Analytics

athenahealth provides network-enabled services for healthcare and point-of-care mobile apps to drive clinical and financial results for its hospital and ambulatory clients

Executive Summary

The goals of establishing (and monitoring) metrics and KPIs are to predict whether a feature will be successful when it’s released to our users, to identify how a feature can be improved once it is released, and potentially to quantify the impact of those improvements. Using metrics that are shareable, repeatable, and statistically validated speeds up product development and gives teams the information they need to focus on the most promising opportunities. Creating metrics takes time, so the work is often cut short in the interest of speed, which makes further exploration of the “why” behind performance more challenging. Standardizing metrics helps us share them across features, making them reusable and comparable.

Why should you use standardized metrics?

When organizations standardize a set of metrics, the metrics become reusable: the same phrase means exactly the same calculation to everyone in the organization. Standardized metrics offer lower cost, comparability, statistical validity, and more.

Homegrown metric development for individual features is costly: each time you develop a metric for a single feature, it can cost person-weeks, and it may apply to as few as one feature. When you build standardized metrics that can be used by multiple teams, you multiply the return on your effort. Let’s assume you want to track metrics for 66 features in a given release development cycle. If it takes 2 person-weeks to develop and maintain a homegrown metric for each feature, but 10 person-weeks for a more reliable, standard metric, and if the standard metric can be applied to a conservative 40 of those features, then the homegrown approach costs 80 person-weeks where the standard metric costs 10. You have already saved 70 person-weeks, well over 1 person-year, on the first standard metric.
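The arithmetic behind that estimate can be sketched in a few lines of Python. This is only an illustration of the example figures above; the function name and the 52-week person-year are assumptions, not something the article defines.

```python
# Assumption: one person-year is 52 person-weeks (the article does not define it).
WEEKS_PER_PERSON_YEAR = 52


def standardization_savings(features_covered, homegrown_weeks_each, standard_weeks_total):
    """Person-weeks saved by building one standard metric instead of
    one homegrown metric per covered feature (hypothetical helper)."""
    homegrown_total = features_covered * homegrown_weeks_each
    return homegrown_total - standard_weeks_total


# Figures from the example: 40 covered features, 2 person-weeks per
# homegrown metric, 10 person-weeks for the single standard metric.
saved_weeks = standardization_savings(40, 2, 10)
saved_years = saved_weeks / WEEKS_PER_PERSON_YEAR

print(f"Saved {saved_weeks} person-weeks (about {saved_years:.1f} person-years)")
```

Changing `features_covered` shows how quickly the savings grow as more teams reuse the same metric, and where the break-even point sits (5 features under these assumptions).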

Cost reduction has another advantage. Since your company has a limited number of people working on metrics, time spent on homegrown metrics means you’ll only be able to create a few of them. However, a single metric is never enough to tell a full story. For example, if your total number of users went down after a change, and that was your only metric, you’d be disappointed. But what if average user satisfaction is higher, total revenue has increased, and the median revenue per user has gone up? We might have fewer customers but more of the right customers, with a higher profit margin. Telling this complete story took 5 metrics, not 1; relying on a single metric can be misleading. By standardizing metrics and reusing them, we can spend more time building a suite of metrics that tells a story rather than focusing on just a few metrics that are incomplete.

Homegrown metrics can’t be compared to other homegrown metrics: even if two teams both measure “adoption”, that one thing can be calculated in hundreds of different ways. Standard metrics are easy to compare between teams, so you can discuss what the differences mean for our users instead of debating why the metrics might be calculated differently. Numbers are most valuable when compared to other numbers. “$5 million saved” sounds great, but saving $10 million would be even better. Comparing two numbers supports a better decision than a single number alone.
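To make the “adoption” problem concrete, here is a minimal sketch in which two teams compute adoption from the same usage data and get different answers. Everything in it is hypothetical: the data, the `UserActivity` fields, and both definitions are invented for illustration and are not athenahealth’s actual metrics.

```python
from dataclasses import dataclass


@dataclass
class UserActivity:
    user_id: str
    has_access: bool        # whether the feature is enabled for this user
    uses_last_30_days: int  # number of times the user used the feature


# Hypothetical usage data shared by both teams.
activity = [
    UserActivity("u1", True, 12),
    UserActivity("u2", True, 1),
    UserActivity("u3", True, 0),
    UserActivity("u4", True, 0),
    UserActivity("u5", False, 0),
]


def adoption_tried_once(rows):
    """Team A (hypothetical): share of all users who used the feature at least once."""
    return sum(r.uses_last_30_days >= 1 for r in rows) / len(rows)


def adoption_regular_use(rows, min_uses=5):
    """Team B (hypothetical): share of users with access who used it at least min_uses times."""
    eligible = [r for r in rows if r.has_access]
    return sum(r.uses_last_30_days >= min_uses for r in eligible) / len(eligible)


print(f"Team A adoption: {adoption_tried_once(activity):.0%}")
print(f"Team B adoption: {adoption_regular_use(activity):.0%}")
```

Both numbers are defensible, but they differ in numerator, denominator, and threshold, so comparing them says more about the definitions than about the users. Agreeing on one of these functions as the standard removes that ambiguity.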

Let’s look at it from the perspective of surveys. While it’s very easy to write your own survey (you can build one in minutes!), it’s not as easy to write statistically useful survey questions and get clearly actionable results. Pew Research writes that “slight modifications in question wording can affect responses.” Merely swapping the order of two questions in a survey can change responses significantly. For example, in public perception polls, when respondents were asked about George W. Bush’s approval rating first, responses to another question changed by 10%. We found this to be true even in our company-wide Net Promoter Score (NPS). In User NPS, swapping two questions changed our score by 5 points. Similarly, our biggest-ever change in NPS happened when we switched from asking via phone to asking via email. In other words, small changes in how you measure things can easily be larger than the effect you’re trying to measure. Independent research shows that standardized metrics offer more statistical reliability than homegrown ones. Statisticians could talk for days about these kinds of effects. Careful design and standardized measurement can eliminate these types of problems.
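For readers less familiar with NPS “points”, the sketch below uses the commonly published definition: the percentage of promoters (ratings of 9 or 10) minus the percentage of detractors (ratings of 0 through 6). The response data is hypothetical, and the article does not say whether athenahealth’s User NPS follows exactly this formula.

```python
def net_promoter_score(ratings):
    """Standard NPS: percent promoters (9-10) minus percent detractors (0-6).

    Assumes ratings are integers on the usual 0-10 scale.
    """
    promoters = sum(r >= 9 for r in ratings)
    detractors = sum(r <= 6 for r in ratings)
    return 100 * (promoters - detractors) / len(ratings)


# Hypothetical responses from ten users.
baseline = [10, 9, 9, 8, 7, 7, 6, 5, 10, 3]

# Same users, but one respondent answers 8 instead of 9, as might happen
# after a change in question order or survey channel.
shifted = [10, 9, 8, 8, 7, 7, 6, 5, 10, 3]

print(net_promoter_score(baseline))
print(net_promoter_score(shifted))
```

Because the score depends on which bucket each answer falls into, a small nudge in how people respond can move it by several points without any change in the product, which is why the measurement procedure needs to stay fixed.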

If you liked this article, you may also find our related articles in this series valuable on your journey to measuring the user experience: Getting questions about the ROI of UX? 3 ways to refocus the conversation to opportunities, Establishing a “Design Quality” metric to build design credibility, and Applying Machine Learning To User Research: 6 Machine Learning Methods To Yield User Experience Insights.
