Notification System Design (99+)

This is a post about the challenges that come with working on notification systems from a product perspective. I’m going to overload the term “notifications” in this post; what I mean is in-product notifications, push notifications, email, and system-generated text messages. My experience in this space comes from time spent working on Quora’s notifications, Quora’s feed, and email (Gmail / Inbox). I’ve been wanting to write this post for a long time (almost two years, actually) because I think notification systems are a unique and interesting problem space that most designers don’t get a chance to work on, especially at scale. In addition, I think notifications are an incredibly powerful tool for a product person to wield, one that often gets underused, or abused to maximize short-term gains.

What are notifications good at?

Before I go into what makes it challenging to work on notifications, I want to make it clear why they are worth working on in the first place. A notification is the product communicating with you while you are not using it. It is a naturally interruptive and invasive experience, to various degrees. Because of that, it is a very consequential system: everything you send through it will have a material impact on the user’s experience with your product. Compare that to something like a feed, where the user has greater control over when they read it, how fast they scroll through it, what they do with each item, and so on. Given that, notifications are effective at these things:

  • Engagement: I cannot overstate how effective notifications are at getting someone to use your product. Every notification, regardless of its intended purpose, will likely lead to more engagement, but there are many examples of notifications with the explicit goal of pulling you back to the product. These are essentially advertisements. For example, any digest email. One common property of notifications with an explicit engagement goal is that they don’t need to be sent, meaning that the user doesn’t necessarily expect them to come. This is what makes them powerful and dangerous. Most people have experienced an app abusing this for some sort of short-term gain. “Happy Valentine’s day, we love you, come check out our app today!”
  • Transaction Awareness: This is what I would consider most notifications to be. Something happened in the system that you as a user expect to be made aware of. For example: “Someone followed you”, or “someone likes your post.”
  • Communication: An actual person is trying to communicate with you in some way, whether it’s in a chat or a mention, etc.

Quality and personalization

I group quality and personalization together because not all notifications are the same and not all people are the same. There is a whole spectrum of “types” of users, from new user to power user, that not only will have a different tolerance for volume of notifications but will experience different “types” of notifications. For example on Twitter, I suspect that most new users aren’t subscribing to other people and experiencing the onslaught of notifications that that creates. On the flip side, I suspect that most power users of Twitter aren’t getting prompted to follow more users. Even within these cohorts, different people will have different preferences for what they want to see and what feels appropriate. This is all so sensitive due to the invasiveness of notifications as a mechanic. I do believe if you can hold quality high and have an interaction set that allows for dealing with large volumes of notifications, you can send someone a ton of notifications without them feeling upset. Chat apps are a good example of this. If you are talking with someone you want to talk to, you actually start to experience anxiety when they aren’t making your phone buzz!
 
In a perfect world, your product could have hundreds of different notifications and the system would know exactly what notifications to send to each person, and it would send just the right amount, and it would send them at just the right time of day, and it would even know that in the afternoon they like push notifications on their phone but in the morning they like receiving emails. No system I know of has gotten this personalized, not because it’s not possible (you could achieve this with ML) but because given the complexity of something like this, the ROI is probably just not there. 
 
When I’ve thought about this matrix of problems I’ve focused on four scenarios:
 
Vacation: This is when someone doesn’t use your product for some meaningful amount of time and, when they return, there are so many unread notifications that they feel overwhelmed and become disengaged. Two common solutions to this problem are what I’ll call “instant clear” and “batch clearing.” “Instant clear” is when you enter the space and your unread notification count is immediately reset to zero. This removes the sense of being overwhelmed, but it mostly works when you have a large number of “Transaction” notification types that the user can skim to get what they need without addressing each one individually. When each notification requires the user’s heavy consideration, you often see “batch clearing” instead, where the user dismisses a whole group of notifications in one action. These products often have a lot of “Communication” notifications. Google Inbox leverages this a lot.
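To make the difference between the two concrete, here is a minimal sketch of how I picture them. The class and field names (NotificationInbox, badge_count, group) are made up for illustration and are not how any particular product implements this.

```python
from dataclasses import dataclass, field


@dataclass
class Notification:
    id: int
    group: str          # hypothetical grouping key, e.g. "follows" or a thread name
    read: bool = False


@dataclass
class NotificationInbox:
    items: list[Notification] = field(default_factory=list)
    badge_count: int = 0  # the unread number shown on the bell or app icon

    def receive(self, notification: Notification) -> None:
        self.items.append(notification)
        self.badge_count += 1

    def open_instant_clear(self) -> None:
        """Instant clear: opening the inbox resets the badge to zero.

        Individual items keep their read state, so the user can still skim them.
        """
        self.badge_count = 0

    def batch_clear(self, group: str) -> int:
        """Batch clearing: the user dismisses one whole group in a single action."""
        cleared = 0
        for n in self.items:
            if n.group == group and not n.read:
                n.read = True
                cleared += 1
        self.badge_count = max(0, self.badge_count - cleared)
        return cleared
```

The point of the sketch is that instant clear only touches the counter, while batch clearing changes the state of the items themselves, which is why it suits notifications that deserve individual consideration.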
 
Volume (Verified and Power Users): A verified user is someone who has enough status in the real world that they get an enormous amount of response to anything they do, regardless of its quality. Verified users tend to get a large number of notifications for all kinds of reasons: new comments, messages, follows, likes, etc. Anything they do gets a lot of attention. If you do nothing special for this type of user, their notifications turn into an unmanageable flood.

Power users, like Verified Users, will often end up getting a lot of notifications, almost certainly not at the same scale but still in the top 1% of users. The difference between them is their tolerance for volume and quality because of their relationship to the product. As I mentioned before, quality is entirely subjective in this space. A power user, for example, might want to know every time any person likes their content because they are so deeply engaged with the product and the community that they want that experience.

Both of these users suffer from volume overload in various ways. A common solution to this problem is aggregation of notifications. So if you got 1,000 new followers, instead of sending you 1,000 new notifications, the product would just say that you got 1,000 new followers. There are two problems with this: What if the receiver of the notification actually knows some subset of that 1,000 followers? And, when do we actually start aggregating? Do we wait until some heuristic is met like there are 1,000 they haven’t seen? Or is it just about whenever they last logged in? 
 
Both of these benefit greatly from some determination of affinity between the receiver and the user on the other end of the notification. This could be accomplished by: some heuristics you write, some score where you set weights for different attributes, or some sort of ML solution personalized to each user. If you want to experience more complexity, you might do this slightly differently for different notifications! (We do this a lot at Quora).
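Here is a rough sketch of the “score with weights” option combined with aggregation. Everything specific in it is an assumption for illustration: the signal names, the weights, the aggregate_after cutoff, and the affinity_threshold are invented, and this is not how Quora does it.

```python
from dataclasses import dataclass

# Hypothetical weights; in practice these would be tuned by hand or learned.
AFFINITY_WEIGHTS = {
    "receiver_follows_actor": 0.5,
    "has_messaged_before": 0.3,
    "shared_topics": 0.2,
}


@dataclass
class FollowEvent:
    actor: str
    signals: dict[str, float]  # each signal is assumed to be scaled to 0..1


def affinity(event: FollowEvent) -> float:
    """Weighted sum of whatever signals are present for this actor."""
    return sum(AFFINITY_WEIGHTS.get(name, 0.0) * value
               for name, value in event.signals.items())


def build_follow_notifications(events, aggregate_after=3, affinity_threshold=0.5):
    """Turn a batch of unseen follow events into notification strings."""
    # Small batches are sent individually; aggregation only kicks in past the cutoff.
    if len(events) <= aggregate_after:
        return [f"{e.actor} followed you" for e in events]

    # People the receiver probably knows still get their own notification.
    known = [e for e in events if affinity(e) >= affinity_threshold]
    rest = len(events) - len(known)

    messages = [f"{e.actor} followed you" for e in known]
    if rest:
        noun = "follower" if rest == 1 else "followers"
        messages.append(f"{rest} new {noun}")
    return messages
```

With one known person and four strangers, this yields one individual notification plus a single “4 new followers” rollup. The aggregate_after cutoff is exactly the kind of heuristic the question above is asking about; the sketch does not answer it, it just gives it a name.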
 
Lack Of Signal (New User): New users are just getting their feet wet in the product. They are probably not fully engaged and probably don’t fully understand how and why to use the product. I would guess that on average new users don’t receive as many notifications in any product as they could. Often this is due to lack of signal about who they are, what they want, and who they know in the network. They also just don’t take many actions so they aren’t able to see many transaction or communication notifications. Engagement notifications are best used for this cohort, if you can justify it with something high quality.

Channel selection

In this context, channel means both where a notification appears and how invasive it is. The challenge here is, when you know you should send a user some sort of notification, how do you know where it makes the most sense to send it? Should it be an email, should it be in the product, should it just badge the app but not push, should it push but make no sound, should it buzz/make a sound on their phone, should you call the user, or should it be in something entirely inconsequential like a feed? I consider this challenge to be largely unsolved. In many cases, there is a 1:1 relationship between what happens in a product and when your app badges; there is a large overlap between what happens there and what emails are sent; and then for push notifications it feels like every product just uses different heuristics. When you have to design the settings for these things, it becomes obvious to the user how hedgy these decisions often are. You either end up with a small set of extremely vague settings, or you end up with an overwhelming display of toggles in an attempt to give the user some sense of control.
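As an illustration of what one of these heuristics might look like, here is a small sketch. The ordering of channels by invasiveness, the per-purpose defaults, and the quiet-hours rule are all my assumptions for the example, not a recommendation or a description of any real product.

```python
from enum import IntEnum


class Channel(IntEnum):
    """Channels ordered from least to most invasive (an assumed ordering)."""
    IN_PRODUCT = 1
    BADGE = 2
    EMAIL = 3
    SILENT_PUSH = 4
    PUSH_WITH_SOUND = 5


# Hypothetical ceiling for each notification purpose.
DEFAULT_CEILING = {
    "communication": Channel.PUSH_WITH_SOUND,
    "transaction": Channel.BADGE,
    "engagement": Channel.EMAIL,
}


def choose_channel(purpose: str, user_ceiling: Channel, quiet_hours: bool) -> Channel:
    """Pick the most invasive channel allowed by both the product and the user."""
    # Unknown purposes fall back to the least invasive channel.
    ceiling = min(DEFAULT_CEILING.get(purpose, Channel.IN_PRODUCT), user_ceiling)
    # During quiet hours, never make the phone buzz or ring.
    if quiet_hours and ceiling > Channel.SILENT_PUSH:
        ceiling = Channel.SILENT_PUSH
    return ceiling
```

Notice that the whole settings problem collapses into a single user_ceiling value here. That is the “small set of extremely vague settings” end of the spectrum; the other end would be one ceiling per notification type.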

Measuring success & sentiment

How do you know if a notification you added was the right thing for the product and your users? If you really want to do this well, it becomes extremely difficult for a few reasons.
 
More → More
In my experience, every notification I’ve ever added has increased engagement. In addition, whenever I’ve tried sending a notification more frequently, I’ve increased engagement. Sounds great, right? Most people have probably experienced a notification that appears to be driven by this line of thinking. Intuitively this probably feels like a bad path to you, especially if you have ever been on the other side of it as a user. There are a couple of potential problems. First of all, you could eventually just burn people out. It might be great for a while, but eventually the user might burn out and delete your app. Another problem is that users might engage with your notification for many reasons. They could swipe open the push notification because they can’t wait to see what it is, or they could just be using it as a quick entry point into the app so they can go to the settings page and figure out how to turn it off. It’s easy to fall into the trap of relying on short-term engagement metrics, and given how many products do, it seems that people either don’t care, aren’t thinking deeply about this, or aren’t actually seeing any consequences.
 
Negative Signal Lags
If you intuitively agree that More → More is eventually going to have some sort of negative outcome, then you might start to think about how you would measure that. Some ways that I’m aware of are:

  • Measure major bad outcomes like app deletion or email unsubscribe
  • Capture signal with UI in some way. For example, the ability to report a notification, saying you don’t like it. (You see this a lot with advertisements).
  • User research
  • Engagement metrics (Nothing is absolute, they could go down)

The problem with these things is that often this signal is lagging and takes a long time to collect. For example someone could delete your app because they are fed up with your notifications after they received one too many. Or maybe they decided to delete but forgot, then some other notification that was totally fine reminded them to delete your app. 
 
Explicit negative signal collection is always going to be sparse and hard to rely on.
 
User research is fantastic but resource intensive and could always have the same problem as explicit signal collection.
 
Positive Signal is Inconsistent
In a product like Quora, there are many different types of notifications that serve many different purposes. This means that it can be really hard to know if a user liked the notification they got or even engaged with it at all. For example, if a user gets a notification that someone followed them, is a successful outcome that the user followed the other person back, clicked on their profile, or just saw the notification and felt cool? Another type of notification would be one where there is an explicit action attached. On Quora, someone might ask you to answer a question. Success there would mean that that user responds to that request.
 
I don’t actually know how other companies have thought about these problems. I suspect that they do a combination of all the things I mentioned. In cases like this, where there is a ton of complexity, it’s often best to pick something simple that you think will be the best proxy for sentiment, and then use user research to sanity-check things every so often. I suspect that most companies are just using CTR (click-through rate), maintaining some minimum required value to justify the existence of an individual notification, and assuming that is enough to indicate a healthy system. A more advanced version of this would include a global measurement of each user’s overall CTR, to catch volume burnout rather than just looking at CTR per notification. Part of balancing a notification system, I think, is just being aware of the traps.
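A minimal sketch of that two-level check might look like the following. The event shape (dicts with user, type, and clicked) and both minimum values are assumptions I picked for the example, not numbers from any real system.

```python
from collections import defaultdict


def click_through_rates(events, key):
    """Compute CTR grouped by `key`, which is either "type" or "user".

    `events` is an iterable of dicts with "user", "type", and "clicked" (bool).
    """
    sent = defaultdict(int)
    clicked = defaultdict(int)
    for e in events:
        sent[e[key]] += 1
        clicked[e[key]] += int(e["clicked"])
    return {k: clicked[k] / sent[k] for k in sent}


def health_report(events, min_type_ctr=0.05, min_user_ctr=0.02):
    """Flag notification types and users whose CTR falls below a minimum."""
    per_type = click_through_rates(events, "type")
    per_user = click_through_rates(events, "user")
    return {
        # Notification types that may not justify their existence.
        "types_below_minimum": sorted(t for t, r in per_type.items() if r < min_type_ctr),
        # Users who ignore nearly everything: a possible sign of volume burnout.
        "users_at_risk": sorted(u for u, r in per_user.items() if r < min_user_ctr),
    }
```

Keep in mind the caveat from earlier: a click is not always a happy click, so a report like this is a proxy to sanity-check with research, not a verdict.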

Operating within an OS

When I’m referring to an OS I mean any super system that a user is using to manage your notifications and all their other ones. Examples would be iOS, Android, Chrome, Email. This is going to feel somewhat recursive but I’ll try and focus on what makes an OS different, at least right now.
 
The biggest challenge when working within an OS is that it doesn’t really care about your app. It doesn’t care about engagement at all. An OS is largely a utility to enter in and out of various engaging experiences. What this means is that something like iOS is going to be a big bottleneck on what you can do and how smart you can be with your notifications. The problems that operating systems have to solve are related to volume management, and the only way they have really attempted to solve that is via interaction design. You see lots of aggregation and grouping by app with batch clearing. Ultimately though all the notifications are time ordered and have the same visual weight. Notifications from a game telling you they have a 20% off sale on their micro-transaction could be pushing down a missed call from your best friend.
 
I want to be careful because I don’t actually know what an OS should do. I don’t fully understand the constraints, particularly if there are any legal ones, but I can speculate about how things could work. The problem is that operating systems don’t display any sort of bias towards the user’s personal preferences. Imagine if iOS learned that Quora was your favorite app and started to give priority to your Quora notifications over those of any other app. That would be great for Quora and bad for all the other apps you have installed. Technically you could go now and turn off notifications for everything but Quora, but that’s what I mean by them solving this problem via interaction design. You can make that interaction as easy as you want, but ultimately it’s work/curation, and deduced personalization will always scale better. The problem is that this additional layer of complexity could easily end up hurting your app even more. In email this happens a lot: a sender who sends lots of emails suddenly gets flagged as spam and no longer gets the CTR they are used to. Maybe this is right, or maybe it’s not. Ultimately you don’t have control. Even showing up in different Gmail tabs (social, promotions, updates) can matter a lot. If I were designing an OS I would probably try something in this space, but as the owner of a product that exists in an OS, I’m not sure I want it.
 
One OS/platform I want to call out in particular is Voice (Google Home, Siri, Alexa, etc). I was fortunate enough to get to design how Quora works in the voice format and found their lack of a notification type experience to be a major limit on the upside of the platform. As voice exists today it only has a unidirectional relationship with the user, which means that the only experiences available are initiated by the user. If you consider voice to just be a node in the constellation of devices a user is expected to be plugged into at all times (phone, computer, watch, etc) then perhaps this doesn’t matter. If you expect voice to stand on its own in any meaningful way, this is something that they will need to figure out.

Conclusion

Notification systems are complex but powerful tools that products can leverage. I shared some of the major axes of challenges I have faced while thinking about notification systems. Everything gets even more complicated when they start to intersect and need to trade off against each other. There is so much upside, both in system design and end user experience, that I think we have yet to reach as an industry. I hope we will see increased leveraging of ML and perhaps some paradigm shifts in how different OSes handle notifications that will unlock all kinds of new experiences in the next few years.
