How to successfully execute a redesign of your service to improve your business.
“The new design is great, the numbers just don’t reflect this” is not a reason to ship a new design.
I am not a designer. I cannot draw a straight line to save my life, and my color matching sense peaked in the era of the amber-colored VT50 terminal. This is an article about the pitfalls of rolling out a new design for your service, and suggestions to improve the process. I’ve been through a few redesigns, seen a few more, and talked to other companies about how they handle it. I decided to write this post because I could not find a description of a good process to follow anywhere.
This process may not be relevant to you if you have very few users, don’t measure / care about actions on your site, or are significantly changing your core offering rather than making an incremental change to drive growth.
If you use your app or website to convert customers and if actions taken there directly drive your business, read on.
How you think it will go
- The team works on a new design, does usability testing, and gets it blessed by the CEO. CEOs (or very senior execs) will tend to get involved in these redesigns since this is the brand of the company staring users in the face.
- You estimate the work and execute according to plan (1–2 months?).
- You start rolling out, hopefully as an A/B test. You roll out to 5%, then within 2 weeks see success and roll out to 100% of users.
- The rollout is widely successful and your business soars. You party inside the company while critics rave about your new design.
How it will actually go
That was a nice dream. Meanwhile in the land of reality, things have progressed differently.
- Feasibility: The design didn’t take into account technical complexity and feasibility, and had to change. Until you tried connecting real data in real-world conditions to the screens the engineers are building, no one discovered it wouldn’t work. Maybe it requires too much computation to do within the 0.5 second constraint you want to keep for starting the app. Maybe the network connection isn’t great for sending all this data down to the device in a timely fashion. Maybe technology just doesn’t support finding a new AR surface as the user twirls the phone around and around. Reality is a downer.
- Complexity: You significantly underestimated the interdependency of the code and the old design, or the amount of coordination needed between teams to handle the redesign. I won’t pursue this one in this article since it’s a common enough case in project management.
- Testing: You discover you don’t have a great testing strategy for this new design. In fact no one remembers every feature that exists in your product to even get a good test plan in place. Again, out of scope for this article.
- Office vs. World: Rolling out to 5% breaks the product for those 5% because of some case that doesn’t happen on the corporate network. Hopefully you have a kill switch and can roll back to 0%.
- Too many factors: After fixing the main issue and this time retesting with 1% to make sure it works, you roll out to 5% again. The numbers don’t look good. No one is quite sure why.
- Analytics: Trying to analyze what’s happening, you discover you did not instrument the critical flows in the app, or that your conversion event is called “user_purchase” in the old design and “user_checkout” in the new design, and no one knew since you didn’t test this. You do another release to fix analytics, then wait to collect data.
- The obvious solution: Most people think the problem is doodad A. You bet that’s the problem, fix it, release again, and the numbers still suck.
- The pressure cooker: At this point, 3 teams have built features on your new redesign and not the old design since you told them New Shiny Thing is coming out soon. They can’t get user exposure since you’re at 5% still. Pressure mounts!
- Fatigue: You do another last-ditch effort to fix the issue. Another release goes by but the numbers still look bad.
- Lies, damn lies, and statistics: The CEO makes an executive decision to roll this out because a) the redesign is great and “the numbers don’t really reflect that” or b) those 3 other teams need to get their features out. You all agree to keep working on improving the design after it’s released.
- The new normal: The new design is rolled out to 100% of users and a new, lower, business baseline is established. No one is happy but everyone was so tired of this that it’s a blessing.
A blueprint for redesign
Successfully rolling out a redesign is hard, but there are actions you can take to make it more predictable.
The planning phase
- Acknowledge that you might fail. This is hard to do. No one starts a project thinking it might fail. Like with any other feature, call out to the team upfront that the redesign needs to improve something for the users and the business, or else it won’t ship.
- Have a really clear goal in mind for the redesign: “We are redesigning in order to XYZ.” Know how to measure XYZ. At any step of the process, as design changes, features start working, etc., ask yourself “Is this still doing XYZ?” It’s very easy to lose the goal you started with as the project evolves.
For example, if the goal of the redesign is to allow direct navigation from other parts of the app, keep validating that the flow actually achieves this. If it doesn’t, have a really good reason for why you’re continuing. You might have discovered a new goal, or you might just be continuing because of inertia.
- Bring design, engineering, analytics, QA, and other functions together as early as possible in the process. The institutional knowledge and the different skills on the team will help you avoid very costly assumptions in the design early on.
- It’s very easy to lump changing brand colors, navigation paradigms, and new features along with updated flows. In fact internal sentiment will typically assume that this is A Better Design(™). You’ll have done usability studies and shown that this new design is so much better. What could go wrong?
In reality, users tend to behave differently than you think. They might not like the color, or they think of the service differently than you do. Your numbers are likely to go down once you start testing, and you will not be able to tell why. Usability studies will give you directional ideas, and tell you if anything major is broken, but while you ask “how would you do X”, users might not have that in mind when they go to your service.
Push hard on these two questions:
- Is the entire design really needed, or can you isolate the key change and start with that?
- Can you test any of the new design components inside the old design? This will tell you if you’re even on the right track.
- Don’t keep this process secret. Some companies will hide the design from people inside the company either from fear of criticism or from fear of it leaking. Criticism is good — it gives you points of view that you haven’t considered. Leaking is a real possibility, though trusting people usually gets you trust back. Leaks might not be as big a risk as you might imagine. Your competitors don’t really know why you’re doing the redesign, and will typically wait to see what reactions you get outside. If you’re really worried, you can always leak a fake design and watch them not do anything.
- When estimating the work, allow for time to experiment. Assume you might need to do 5–10 additional sequential experiments. Multiply your release cycle + data gathering time by the number of cycles to get a more realistic timeframe. For example, if you release your app once a week and need 2 weeks to gather data, your shortest cycle is 3 weeks (assuming you’re very fast at implementing changes). Multiply that by 5–10 cycles. Yes, that’s long!
A major redesign can take 3–6 months or longer. Get your executives onboard. If they think this is happening in 2 months, you’ll start getting into the forced release decision stage.
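To make the arithmetic above concrete, here is a minimal sketch in Python. The function name and its parameters are made up for illustration, and it only counts the sequential experiment cycles, not the time it takes to build the redesign in the first place.

```python
def experiment_timeline_weeks(release_cycle_weeks, data_gathering_weeks, cycles):
    """Weeks spent on sequential experiments: one release plus one
    data-gathering window per cycle, repeated for every cycle."""
    return (release_cycle_weeks + data_gathering_weeks) * cycles


# The example from the text: weekly releases, 2 weeks of data, 5-10 cycles.
best_case = experiment_timeline_weeks(1, 2, 5)    # 15 weeks
worst_case = experiment_timeline_weeks(1, 2, 10)  # 30 weeks
print(f"Plan for {best_case} to {worst_case} weeks of experimentation")
```

Plug in your own release cadence and data window before you commit to a date with your executives.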
- Define how you’ll measure success, i.e. what metric has to improve, and what metric must not drop; the rest of your metrics might go up or down. The new design should improve something (for example the number of orders you get) while not harming other metrics (e.g. user signups). Other metrics will change, since complex systems are hard to predict. Be ok with non-critical metrics dropping if you’re seeing a meaningful improvement in critical places. See the misc section below for more on metrics going down.
Some companies define success as a successful improvement in one specific market; they fix conversion in the rest of the world after shipping. For example the redesign might improve US conversions, but drop conversions in the rest of the world. This is valid when your primary business is in that market and that’s where you’re expected to add value. It’s also valid since different cultures react differently to redesigns, and finding a universal design is not a trivial task. In this case, local-focused teams can keep improving the design for specific markets after the launch of the primary redesign.
Pro-tip: Be careful of compound metrics. Orders per user can go up if users drop. Make sure to check both before declaring success.
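Here is a small worked example of that trap, as a Python sketch. All the numbers are hypothetical, and `VariantStats` and `relative_changes` are names I invented for the illustration.

```python
from dataclasses import dataclass


@dataclass
class VariantStats:
    active_users: int
    orders: int

    @property
    def orders_per_user(self) -> float:
        return self.orders / self.active_users


def relative_changes(control: VariantStats, test: VariantStats) -> dict:
    """Relative change of the compound metric and of both of its parts."""
    return {
        "orders_per_user": test.orders_per_user / control.orders_per_user - 1,
        "active_users": test.active_users / control.active_users - 1,
        "orders": test.orders / control.orders - 1,
    }


# Hypothetical numbers: the new design loses users and orders,
# yet orders per user still goes up.
control = VariantStats(active_users=10_000, orders=5_000)  # 0.50 orders per user
test = VariantStats(active_users=8_000, orders=4_400)      # 0.55 orders per user

for metric, change in relative_changes(control, test).items():
    print(f"{metric}: {change:+.0%}")
```

In this made-up case the compound metric is up 10% while active users are down 20% and total orders are down 12%, which is not a win for the business.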
- Figure out your user testing strategy. You’ll want to minimize user impact, but still have enough users to get statistically significant results within a relevant timeframe. Your data scientists and analysts can help you figure this out. For example 5% exposure (2.5% test / 2.5% control) might give you statistically significant results for changes over 1% within 2 weeks. This is very dependent on your service, frequency of use, number of users, etc. If you plan on running multiple variants, you’ll need bigger exposure.
Note: You could try segmenting your test by new vs. existing users. New users aren’t hung up on your old design, so comparing their behavior across variants can give you a good sense of where your product can evolve to and remove some of the novelty effect. You would still need to figure out a transition for existing users.
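If you want a rough feel for the numbers before talking to your data scientists, the sketch below uses the textbook normal-approximation formula for comparing two conversion rates. This is my own back-of-the-envelope illustration, not a substitute for their analysis: the function names, the 5% significance level, the 80% power, and the traffic figures are all assumptions.

```python
import math
from statistics import NormalDist


def users_per_arm(baseline_rate, relative_lift, alpha=0.05, power=0.8):
    """Approximate users needed in EACH arm (test and control) to detect
    a relative change in a conversion rate with a two-sided test."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = NormalDist().inv_cdf(power)           # desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def days_to_collect(n_per_arm, new_users_per_day, arm_fraction):
    """Crude estimate: assumes a steady stream of distinct users per day,
    of which `arm_fraction` land in each arm."""
    return math.ceil(n_per_arm / (new_users_per_day * arm_fraction))


# Hypothetical service: 10% baseline conversion, looking for a 10% relative lift,
# 100,000 distinct users a day, 2.5% of them in each arm.
needed = users_per_arm(baseline_rate=0.10, relative_lift=0.10)
print(needed, "users per arm")
print(days_to_collect(needed, new_users_per_day=100_000, arm_fraction=0.025), "days")
```

Notice how quickly the requirement grows as the change you want to detect shrinks. That is why small exposure combined with small effects can stretch each experiment cycle.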
- Figure out a test plan for the new design. Often you’ll have to clearly list which features will be available and how the new flow will need to operate. Don’t be surprised if people don’t agree on these at first. It’s a good verification for your plan.
The execution phase
- Tell teams that are not directly contributing to the redesign to keep building on the old design, and to keep track of features they ship. This is painful, but it’s important for them to be able to ship and test with a majority of the user base, and avoids getting artificial pressure to ship a design that hurts the business in order to unblock other teams. See the misc section for more about this.
- If you need to estimate the impact of not having a certain component in the new design (yet), try turning it off in the old design to get a sense of how users might react to different flows. This isn’t my favorite way of testing (you’re making the product worse for users), but it will save you time in getting to a better experience.
- Define carefully the analytics events that drive your metrics. Make sure you know exactly which flows are critical (page visits, actions, navigation, time spent, errors), and make sure you have equivalent events between the old and new flows. When the time comes to understand why the numbers are going down, this information is invaluable. Test the analytics events and your dashboards with both flows before launching.
- Expose the rest of the company to the new design as soon as possible. They’re your best testers. Give them a platform to report issues, and if possible a place to debate them publicly within the company. You’ll get a sense of how customers might be thinking of the new service and what sentiment you might expect to get. Good platforms for this are mailing lists, Workplace by Facebook, internal chat forums, etc.
- Test the new design in real-world situations. The engineers in your company have the best networks in the world, while your customers use 2.5G phones. You have the largest phablets money can buy. Your customer might be on an old iPhone 5 with a small screen, or a tiny Android with limited memory. Your designers have that 4K 30” monitor, while your users have a VGA resolution screen.
- Get a good multi-variant testing system that will let you roll out, roll back, allocate users, keep holdouts, etc.
Pro tip: Assume that your multi-variant testing system will fail every once in a while. It will fail because of network issues, timing issues, temporary provider downtime, etc. In most of these cases a programmer-supplied default value takes precedence. Make sure the default is set to the old design, not the new one. You do not want to accidentally release this to many users without getting it well tested.
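A minimal sketch of that safe default, in Python. It is not tied to any real experimentation product: `fetch_variant` stands in for whatever call your testing system exposes, and the variant names are made up.

```python
OLD_DESIGN = "old_design"
NEW_DESIGN = "new_design"


def resolve_design(fetch_variant, user_id):
    """Return the design to show, falling back to the old design on any failure."""
    try:
        variant = fetch_variant(user_id)
    except Exception:
        # Network issue, timeout, provider downtime: stay on the old design.
        return OLD_DESIGN
    # An unknown or empty answer is treated like a failure too.
    return variant if variant in (OLD_DESIGN, NEW_DESIGN) else OLD_DESIGN


def flaky_provider(user_id):
    """Hypothetical provider that is currently down."""
    raise TimeoutError("experiment service did not respond")


print(resolve_design(flaky_provider, "user-123"))  # old_design
```

The point is that the new design is only ever shown when the testing system explicitly says so. Every other path ends at the old design.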
The rollout phase
It’s time to start testing. This is so exciting!
Your goal is to reach a point where you can show the company that this is worthwhile: your users and the business are all gaining, while you’re not losing anything critical.
- At this point people in the company should have already been using the new design for a while to identify any possible issues.
- Start with a small percent of users (<1%) to make sure things are working. Monitor closely, watch for bug reports and support tickets, monitor App Store reviews and social media, and look at business metrics to see that nothing really bad is happening.
- Assuming all is well, roll out to the minimum number of users you need in order to get statistically significant results and continue monitoring.
- Meet on a weekly basis to review current stats and formulate hypotheses. At this point you’ll see some things go up and some things go down, but might not know why. Get engineers, designers, PMs, data scientists, etc. in the room to talk about what you’re seeing and come up with theories. Say up front that any theory is valid, to avoid people getting upset if their favorite part of the design is criticized.
* Evaluate if the issue might be data. Are you logging correctly? Are you missing events? Are events incorrectly named in the new design?
* Find ways to test theories. For example if you think the placement of an item in the menu influences the drop (it does!), test positioning.
* If there are disagreements in the team as to why something is happening, consider testing both options. It’s important for people to be heard, and you might be surprised with what you learn.
* Plan as many parallel experiments as you can handle. It’s easy to fall into the trap of assuming you know the one thing that will fix the problem. You then wait a whole cycle to discover that it didn’t work.
* Pro-tip: novelty effect. Users seeing something for the first time might be interested in exploring it more and you’ll see higher numbers for a while, or might not be sure how to use the system and you’ll see lower numbers for a while. Numbers eventually stabilize, so make sure you don’t jump to conclusions. For example, adding a new dialog / modal that calls out a new feature / button might cause that feature to get immediate attention. Make sure you evaluate people who’ve already seen the dialog to see if their long term behavior actually changes.
- Communicate the current state / results / theories widely in the company. This changes from company to company, but the more visibility you provide, the more people understand what you’re trying to do and how you’re going about it, and the more support you will get.
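The data check from the weekly review (missing or differently named events) is easy to automate. Below is a rough Python sketch that compares the event names logged by each design against a list of critical events. The "user_purchase" and "user_checkout" names come from the mismatch described earlier in this article. The function name and the other event names are made up for the example.

```python
def compare_event_coverage(old_events, new_events, critical_events):
    """Report critical events missing from either flow, plus events
    that only the new design emits (possible renames)."""
    old_seen, new_seen = set(old_events), set(new_events)
    critical = set(critical_events)
    return {
        "missing_in_old": sorted(critical - old_seen),
        "missing_in_new": sorted(critical - new_seen),
        "only_in_new": sorted(new_seen - old_seen),
    }


# Hypothetical event names captured from a test session in each design.
critical = ["page_view", "add_to_cart", "user_purchase"]
old_log = ["page_view", "add_to_cart", "user_purchase"]
new_log = ["page_view", "add_to_cart", "user_checkout"]

report = compare_event_coverage(old_log, new_log, critical)
print(report)
```

Here the report flags "user_purchase" as missing in the new design and "user_checkout" as new, which is exactly the kind of rename you want to catch before launch rather than after two weeks of bad numbers.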
Outcome: Woohoo your redesign is driving business results!
Once you’ve proven this is a great design, consider a full rollout. The rest of the company has to port more features onto the new design. You also need to get more users exposed to the new design. There are a couple of ways of handling this:
- All teams sign up to port and work towards getting the app ready for full exposure. Once everyone is done, you gradually roll out to 100%.
- You roll out to 100%; teams start porting to the new design after the fact. This might mean that some features won’t be available temporarily, or that the design of the app might not be uniform for a while. This is fairly typical for really big apps, where there’s a lot of work to do.
Outcome: The redesign fails
It might. This sucks, but you should be ready for it. You might have been going down the wrong path, or might have run out of time to try more things. Assuming you don’t just force the release anyway, here are a few things you can do:
- Remember that people on the team have been putting a lot of effort into this. It’s emotional, it’s a long project, and it’s something that’s visibly happening in the company. Acknowledge this and give them an outlet to grieve.
- Do a post mortem session and talk about what you could have done differently. Talk about the assumptions that started the whole process. You might find a faulty assumption at the base of the project, e.g. “This case works better, so let’s do it for the entire app” or “our competitor does this and they must know something”.
- Identify what you did learn from all your attempts. A failed redesign usually has some lessons learned that apply to the old design. Reenergize the team around porting these learnings into the old design and celebrate the impact this brings.
- Teach the rest of the company. Someone somewhere is planning the next redesign and can benefit from your knowledge.
The metrics you set are important because they give you a more objective way to analyze how your new design is serving your customer base. You might be willing to sacrifice behavior A to increase behavior B. So long as the trade-off is clear, that’s not a problem.
In cases of proxy metrics, a dropping metric might make sense. Think of a conversion metric that looks at a step in the middle of the funnel vs. the actual conversion. You might have chosen to use this metric since you cannot see the final conversion on a different site. In this case, you optimize for how many people pass that middle funnel step, a proxy metric. Part of your design might weed out people who were not going to convert anyway, but the result is that now the proxy metric is dropping. Theorize if that’s what’s happening, then test it. If that’s really the case, it’s ok for the metric to drop; you just have to quantify how much of it is dropping because of this change.
I’ve had a case where showing some information at the top of a page reduced taps on a unit lower down. The theory was that the taps were going down because the information at the top was pushing the other unit below the fold. Our designer insisted this was happening because of the information itself, which was helping users leave the page early if it wasn’t relevant for them. We ended up testing both variants (reduce the size of the information, and remove the information while keeping the size the same). Our designer was correct. We ended up shipping with this drop since it improved the user experience.
Common cases and the people side of things
These are a few cases that might come up as you go through this process.
“The new design is great, the numbers just don’t reflect this”
This is usually used to justify shipping despite a drop in metrics. This one is hard to stomach. It’s usually rooted in emotional attachment to a specific solution, and frustration that it doesn’t translate well into your business numbers. It’s also a question of how up-front the team is about what it’s trying to achieve and what numbers really matter. You’ll hear “the numbers don’t show how much users like our service” and similar sentiment. This might be true, especially if your metrics don’t align well with your goal (e.g. proxy metrics or incorrectly set metrics). For other cases, it’s good to agree upfront about the metrics you’re trying to increase and why they matter, and keep talking about them throughout the process to make everyone focus on them and not on any particular solution.
Our CEO / VP Marketing wants to do a PR about the redesign so we can’t roll out gradually or test for any length of time.
Education matters here. Talk about the underlying assumptions and about the plans you have in place. Talk about the actual value of the PR to the business. Sometimes a PR is critical because it gets you an uptick in attention, but over time it might not be worth the decline in business metrics. Instead, have a comms plan and be ready to respond to the press if they discover this and talk about the redesign. Here’s an example. Alternatively, be upfront that you’re testing a new design.
Our director is on the line to ship this, so we can’t back out.
Starting this conversation from the beginning is critical. If at any point you ask anyone in the management chain “How confident are you that this will ship?” and the answer is “100%”, have a discussion about success and make sure they understand failure is an option. Make sure their manager is also aware of this. You should still do your best to succeed, but try to avoid not being able to back out.
The designer / lead engineer / PM will quit if we don’t ship this.
They might. At this point you have to consider the worth of someone who’s probably very talented, but who’s willing to sacrifice the success of the business because of their ego. Again, from the beginning of the project, keep talking about the goal and reviewing against the goal so that everyone thinks about the outcome the same way.
The PM, engineers, and designers all have different goals.
Flag this as early as possible. If some people care about changing the colors while others care about increasing business, you’ll have a hard time making decisions on how to proceed. You’ll also get less cooperation in thinking through issues and what you’re seeing. If the team does not agree on the goal and the priorities, escalate and consider shuffling the team.
It’s human nature to prove that your theory is right. For example, if the metric is going down, you’ll question the data and make sure you’re logging correctly and analyzing correctly. If, on the other hand, your data seems positive, you’ll rarely check if it’s correct. If the data looks too good to be true, make sure to check why — it’s worth finding this out early on.
Other teams should keep working on the old design.
Let’s call the two code paths “OldDes” and “NewDes”. You optimistically assume that the redesign will be done and ready to roll out to 100% within 2 months, after 2 weeks of testing. Any feature being worked on for OldDes will barely be out before it needs to be ported over to NewDes. This will just slow down the rollout of NewDes.
There are a couple of fallacies here:
- You assume an optimistic conclusion.
- You assume the redesign is more important than anything else happening in the company and must not be slowed down. In reality, the business is the most important thing for the company, with the redesign being one possible path to get there.
Meanwhile, one of the many gotchas we discussed above happens, and your timeline extends 3x. Many teams have built features on NewDes, but they cannot roll these out to users and cannot even test them well within the 5% exposure you have.
Pressure mounts, and eventually someone makes a hard decision to force NewDes out to unblock the other teams. Your business just suffered!
Instead, tell teams to build on OldDes, and to plan a migration period once you prove NewDes is a better option. If your optimistic estimate was correct and it only took 2 weeks to prove NewDes is great, not much work has been done in OldDes and porting should be relatively straightforward. On the other hand if you end up shutting down NewDes, all the new features are in place and working in OldDes.
The old design might become much better performing than the new design because of all the new features being added. You can keep a tally of what the other improvements changed, or possibly keep a small holdout of users who see the old design without any new features, to have something to compare to. When it’s time to make a decision, you can make some assumption on the impact of porting the new features to the new design. It won’t be an exact estimate, since features work differently in different designs, but it should give you enough information to make a decision. And be aware — the old design might really be better.
Redesigns are hard!
Let’s be clear: doing a major redesign is a very expensive process. It’s very emotional for everyone involved, and has a high opportunity cost. If you care about the design positively impacting your business, you’ll need to find a way to do this and get everyone to buy in.
Redesigns help you unlock potential gains you could not have had in the old design where you might have reached a local maximum. Remember:
- Articulate well what you’re trying to achieve and how you’ll measure it
- Plan enough time to give the team the space to experiment
- Celebrate success or learnings from failure
Have more tips? Please share!