Google’s HEART framework — A Critical Evaluation
Recently, I’ve been consulting with startups on building their roadmaps and executing on core sub-products, in particular those in the mobile AI space, or products built on providing access to, or utilising, data.
I recently came across Google’s HEART framework. I enjoyed the process of evaluating it so much that I thought I would pen down my thoughts.
I’d be keen to hear from PMs on how they use it, if at all!
Thoughts on HEART
HEART stands for Happiness, Engagement, Adoption, Retention and Task Success. Traditional product analytics is usually metrics-focused, and the function normally depends on gathering as much data as possible on net promoter scores, cost of acquisition, conversion rates, retention rates and so on.
For me, the most important thing in thinking about a data strategy for product is always asking yourself “If this metric were X instead of Y, would this cause me to rethink my product and redesign its UX?”. If the answer is no, then gathering the metric is not very useful and purely consumes precious resources. Each metric needs to be actionable and answer a key concern of the business.
HEART is based on Goals rather than Concerns, and perhaps this is where HEART and I differ. HEART is forward-thinking and pre-product (analytics and the metrics you want to gather come before any redesign or brainstorming on what you should change), whereas for me an analytics framework should be about support, validation and checking of an existing product hypothesis (the product team has come up with a redesign we want to execute on; what metrics need to be tracked to test that the redesign is working?).
There are many benefits to each element of HEART. First I will go through each in turn, and then talk through what I see as the pros and cons of the Goals, Signals and Metrics component of HEART.
Happiness focuses on data points like user surveys, where users tell you how much they like the product. What I like about this is the qualitative component and its simplicity (would you tell your friends to use it?). But there are a few weaknesses.
First, these qualitative components are just a yes or a no (I like it or I don’t), or at best a rating on a scale of 1 to 10. They don’t tell you why, and I’d argue that the why is the main thing. If your users don’t like the feature, what would they change, and what’s the root reason for their dissatisfaction? This is why happiness-related metrics find it difficult to rise above the vanity-metric classification: they lack the complexity and nuance of gathering advanced, thoughtful feedback and product criticism.
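To make the “scores don’t tell you why” point concrete, here is a minimal sketch of a standard Net Promoter Score calculation over hypothetical survey responses (0–10, “how likely are you to recommend us?”). Two very different response distributions can yield exactly the same score, which is what makes the number a vanity metric on its own.

```python
def nps(scores):
    """Net Promoter Score: % promoters (9-10) minus % detractors (0-6)."""
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)

# A polarised user base and a lukewarm one produce the same score,
# and neither number explains *why* users feel the way they do.
print(nps([9, 9, 6, 6]))  # → 0.0 (half love it, half are detractors)
print(nps([8, 7, 8, 7]))  # → 0.0 (everyone is merely passive)
```

The identical scores hide completely different product situations, which is exactly the nuance a raw happiness metric throws away.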
Second, users are often not the best judges of whether your product feature has been a success (the famous “users don’t know what’s best” argument). For example, your business needs may be more about commercialisation and building a platform than pleasing everyone. Or users may continuously complain about a feature that they nevertheless use a lot, perhaps due to some hook created elsewhere. The Facebook Like button redesign is a great example: sure, a lot of people were unhappy about how gimmicky it was. However, that feature now allows Facebook to gather a ton of useful data about its users that it never had before. Internally, this could have been seen as a huge success for Facebook.
Third, in relation to measuring Happiness, there is the argument that users might be happy simply because they like your company, not because the feature is an incredible product in isolation. It is difficult to separate “supportive” bias, users just being happy in general, from a genuine evaluation of your product as good or bad.
The Engagement part of HEART has a lot of strengths. User involvement is, for me, the best HEART component for seeing whether people like your product (Engagement is a better proxy for Happiness). The fact that people are excited about a feature is key, and using the goals component of HEART, a team can judge what engagement means for their particular niche or product-specific use case.
Engagement also offers a huge number of signals beyond daily active users. For example, unlike happiness-related metrics, there might be ways to judge engagement on specific sections, subsections or even individual buttons of a homepage, rather than on the app UX as a whole.
From a metrics perspective, a product analyst needs to weight for time very intelligently. For example, how do you account for the natural churn and drop-off in engagement that happens with every new product as it is released, and how can you factor that out of metrics like “number of photos uploaded”? Two weeks in, I will naturally be using a product less than on the first day I tried it. What is the stable mean level, and how can we re-weight to separate true engagement from the decline you would expect anyway? These questions are key.
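One simple way to frame this re-weighting is to model expected usage as novelty wearing off toward a stable level, then look at the residual between observed and expected usage. The sketch below assumes an exponential decay; the initial level, stable level and decay rate are illustrative numbers, not anything prescribed by HEART.

```python
import math

def expected_usage(day, initial=100.0, stable=40.0, decay=0.15):
    """Expected daily actions (e.g. photos uploaded) if only novelty
    wear-off were at play: exponential decay toward a stable level."""
    return stable + (initial - stable) * math.exp(-decay * day)

def engagement_residual(day, observed):
    """Observed minus expected usage. A positive residual suggests
    engagement beyond what natural novelty churn would predict."""
    return observed - expected_usage(day)

# Two weeks in, 55 uploads/day against an expected ~47 suggests the
# feature is outperforming pure novelty decay.
print(round(engagement_residual(14, 55), 1))
```

The point is not the particular curve, but that raw counts like “photos uploaded” only become meaningful once you compare them against the decline you would have expected anyway.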
One would expect adoption to be part of any UX metric. However, what I think about as a product analyst when seeing this is accounting for seasonality and for pure “hype”. For example, many users try a new feature simply because it is new, not because they have an inherent goal to adopt it because it is good. Similarly, a redesigned user on-boarding process may force new users to try the new feature, so the numbers may be very high yet misleading. Hence it is important to understand the actual product, and what constitutes success, by understanding how the user flow works from a design-pattern standpoint.
Retention is inextricably linked to Adoption: the yin to its yang. Retention matters because it prevents features from simply being built and then forgotten as part of a product lifecycle, and allows product changes to permeate and shape the soul of a product for good. Tracking and making sure people continue to perform the actions they have adopted is key not just for the product but paramount to the business too.
However, as with adoption and engagement, sometimes people churn simply because of the natural boredom that sets in from seeing the same UX or UI every day. Churn may also be due to a weak product overall, rather than being any reflection of whether the feature in isolation was a good choice.
Task Success is, for me, the least important part of a UX analytics process. This is because it tends to be engineering-driven, not user-driven. It is also better used for internal benchmarking of teams than for evaluating the UX of a product feature. Take some examples: how long it took for search results to appear, the seconds it took to upload a photo, the time shaved off switching between apps.
This is obviously about improving the design and engineering of a product to be more “efficient”. But in my experience, the user doesn’t care, and sometimes less efficient products are simply more fun and enjoyable to use. Task Success is a good way of seeing whether the product is improving technically, but it feels less focused on UX. You may be delivering terrible, irrelevant search results that don’t benefit the user at all, even if they take 1 ms to arrive.
Goals, Signals, Metrics (“GSM”)
The purpose of GSM is to ensure that metrics are always produced and prioritised for the goals a team has for its own product. The signals part is there to ground those goals in what is achievable, based on the various signals that can conceivably be gathered from trackers or page sensors. The aim of GSM is to allow a “natural prioritisation” of metrics. However, there are a few problems with this.
First, many product people talk about choosing the “one or two metrics that count”. Arguably there should be only one goal per letter in HEART, not several; optimising for even five metrics is already very steep. This emphasis on brainstorming and expanding the diamond can mean that HEART doesn’t account for interplay: for example, can you realistically increase H without decreasing A in the process?
In OKRs, you have one overarching company goal for the product or feature you are building, and this leads to the metrics you choose. I believe a similar methodology could be a good variation on HEART, preventing a multiplicity of goals that may not sit well together.
Secondly, I feel that some elements of GSM overlap heavily, so listing them separately can cause confusion and unnecessary work. For example, what is the “goal” of happiness? Surely happiness is a goal in itself, as are engagement, retention and the rest. Similarly, what is the point of having a signal of retention in your app when you have a metric for retention that already “signals” that users are sticking with your product? Of course, some metrics can lag behind signals, but normally metrics like weekly usage rates serve as both a signal and a metric.
Thirdly, and related to my first point, allowing team members to list candidate signals risks creating too much breadth and not enough focus. Treating what data is realistically logged in the software as an afterthought means a team can waste considerable time gathering data, building systems to store things in a database that consume developer resource for no reason. For example, I once asked one of our engineers to write a Python script to pull LinkedIn profiles from the FullContact API, so we could get a sense of the industry demographics of our users and the engagement per industry, as a proxy for profiling our most interested users. However, that data was never used: as a scrappy startup, we never had the time to put it to our advantage.
I believe that being resource-constrained as a product manager can sometimes be a strength compared with hunting for the most niche, interesting metric possible. For that reason, I prefer the narrower approach to the Signals part of HEART.
Thoughts, ideas, disagreements, other views? I’d love to hear them on @dhruvghulati on Twitter!
Please recommend if you enjoyed this piece :)