The Secret Behind LCM 2.2 Re-design

Alison Cheng
10 min read · Mar 12, 2020

--

The Power of Foundational User Research

Lifecycle Manager (LCM) is a service used by IT admins for data center infrastructure upgrades. It’s such an important tool that many customers cite LCM’s upgrade capability as a key factor in their selection of Nutanix solutions. However, over the years, it began to develop a “difficult to use” reputation, which steadily grew until the LCM 2.2 release ignited a turnaround. Some customers commented that “this is the strongest release (yet)”. With more robust engineering and a new UX paradigm, LCM 2.2 has received great customer feedback. So what role did design, and user research in particular, play in shaping a product experience good enough to regain customers’ trust?

Background

LCM started as a firmware upgrade feature that sat within Prism Element. In the 2.2 release, the product team decided LCM needed to become a full-fledged upgrade service that supports both firmware and software upgrades. I started working on LCM in late 2018, when it was transitioning into a comprehensive upgrade tool.

Back then, I received several feature requests and Jira tickets for bug fixes, such as “dependency information is not clear” and “too much white space on certain pages, so users can’t see as much information in one place”…

Things to be improved in LCM 2.1 (previous)

After several explorations, none of which felt satisfying, I suspected that a 10x improvement for users might come from rethinking the component-centric UX paradigm at the heart of the information architecture and all task flows. At that time, that paradigm made finding and viewing all the upgrade information for a given host a tedious, largely manual process.

While we could point directly at several usability problems in the product, the question we really needed answered was:

How do users decide what to upgrade, and how do they want to deploy upgrades?

When both BIOS and HBA upgrades are available, do admins upgrade BIOS and HBA across their environment (the current component-based view), or do they upgrade BIOS and HBA on host 1, then move on to upgrade them on host 2 (a host-based view)?…

Typical usability testing, in which we show participants a task flow and ask them to comment on its appeal, wasn’t a good choice for answering these questions. Asking users to evaluate designs without first understanding their organizational structure, collaboration model, and upgrade practices could lead us to make some very wrong fundamental assumptions.

Although conducting UX research on LCM wasn’t our initial intent, once we thought more deeply about the challenges, it became a promising approach. If we were to spend our development and design resources wisely, we needed to make sure we understood our customers.

Many think UX research is independent of design, but UX research should be a part of all design activity. Good product design is not about wiring together technology to make a capability possible. Good product design starts with understanding user needs and using that understanding to define how a technical capability should be exposed to a user. This is typically impossible without knowing who you’re designing for and what their daily life is like. Usability testing is the methodology most people are familiar with, but in terms of lifetime value to our customers, foundational research is perhaps the most cost-effective and valuable.

How is foundational research different from usability testing?

Usability testing (aka design validation) is a way to test the usability of a product, or of a specific workflow in a product. It is most valuable when designs are near completion, or when the design choices have been narrowed to two or three approaches. Participants essentially complete specific, pre-selected tasks and give their feedback on the usability issues they observe. This type of testing assumes that the underlying task flows and information architecture match real customer needs (e.g., the user doesn’t have to go to one place to find a piece of data, write it down, go someplace else, write that down… before completing a task). However, usability testing effectively puts blinders on us, and on the users involved, focusing everyone on the specifics of a design rather than on how well it fits real-world use cases and contexts.

Foundational research is a way to understand questions about critical use cases, processes, information flow, tool chains, and organizational structure. Understanding this information can provide important product direction and long-term value to customers. In this type of study, we are not focused on where the buttons should go or what things look like. We are more interested in “how users do their work and why they need to do it that way”. It is the user empathy gleaned from this work that enables a product team to solve real problems for the user, rather than just solving the inward-focused Nutanix problem of building technical solutions as quickly as possible.

Oftentimes, people equate UX research with usability testing, but usability testing is only one, very basic, research technique. Foundational research is much more important in terms of providing product strategy guidance and delivering business value. In my case, to understand different organizations’ upgrade processes, I used foundational research, conducting interviews with real-world users.

If you are interested in doing foundational research, here are a few techniques I used and some tips I learned along the way.

#1: Don’t ask users, “What do you find frustrating?”

Managing IT infrastructure requires domain knowledge, professional training, and years of experience. The IT admins we talked to had 5–10 years of work experience. They’re used to reading lengthy documentation, maintaining spreadsheets to track the installed versions across their data center, calling vendors to get compatibility information, and staying awake most of the night during an upgrade.

In foundational research, we avoid asking users questions such as “what do you want?” or “what is the pain point?”. Doing so usually yields a list of specific flaws, but is ineffective at revealing root causes and true solutions. After years of doing something, human beings lose sight of the bigger picture, and when pressed for answers they predictably produce only immediate, typically superficial “fixes”. As Ford said, if we just asked people what they wanted, we would’ve been told they want a faster horse that eats less. Don’t expect users to give you strategic answers in an interview. Instead, let them describe their current environment, processes, and goals. As the product designers and builders, it is our responsibility to uncover the deeper solutions. For example, an admin’s job might involve copying and pasting a variety of installed-version data into a spreadsheet. Users are likely to say, “make that easier”, and assume that’s just something they will always have to do. They are unlikely to suggest that a better design might eliminate that need entirely.

In our interviews, we learned that many companies’ policies require them to perform quarterly upgrades. Before an upgrade, the system admin will read the release notes to understand what has changed. If a certain release contains major bug fixes, the upgrade might be escalated to an emergency upgrade, which will be conducted as soon as the department allows. If it’s a feature improvement, admins will wait until the company-wide quarterly upgrade. System admins then make recommendations, and have several meetings with upgrade operations admins and various managers, to come up with an upgrade plan. The plan ensures the upgrade stays within a maintenance window and poses minimal risk to the company’s operations.

Since LCM doesn’t provide a time estimate for how long an upgrade takes, admins might first upgrade one or more components on a single host to get a sense of the duration. If everything goes well, they’ll then upgrade the rest of the hosts (one or a few at a time, depending on how long their maintenance window is) until they finish updating all the hosts in a cluster.

Upgrade progress in large organizations

All this information about their organizational structure, how they plan and conduct their upgrade processes, and the operational rules they have to follow gave us clues about how LCM’s design could more seamlessly support their goals.

#2: Keep asking why

Users often don’t know what they need. In a foundational study, our goal is to help them realize and articulate their true needs. One useful technique in user interviews is to keep asking why. For example, suppose you have a friend, Joe, whose needs you’d like to understand.

You start with a simple question:

“Hi Joe, how do you come to work?”

Joe says he bikes to work, but thinks the current bike is too old. You might just stop there and make a note to get Joe a new bike. However, a good foundational researcher will ask:

“Why do you come to work by bike?”

Joe then explains that he bikes to get some exercise and stay healthier.

It turns out what Joe really cares about is his health, and just providing a new bike won’t help unless it improves his health relative to the existing bike. From there, you can open up a set of questions that drill down into how the current bike isn’t satisfying the improved-health requirement. Accepting the first user request without understanding “the why” means you could completely miss the real requirements and spend dollars on a “new bike” that still doesn’t solve Joe’s true problem(s).

In the LCM user study, we asked users which view they prefer when deploying an upgrade.

Component view vs Host view

Some users preferred the host view, and a few didn’t have a preference. We asked “why” as a follow-up question. It turns out these two user groups have different environments. Those who upgrade host by host have a large environment and strict maintenance windows. Upgrading one or two hosts at a time, or one or two components on a host at a time, allows them to keep an upgrade within the maintenance window. (Although all users said that if LCM’s upgrade speed were 10 times faster, they could potentially upgrade everything all at once after validating.)

Those users without a strong preference worked in smaller companies and had small environments, typically 2–3 nodes in a cluster. They upgrade the environment whenever upgrades are available, and upgrade everything in a cluster at once.

If we didn’t continue to ask “why”, we wouldn’t have understood how their size, systems, organization, and policies were really at the root of their needs. As a result, we could’ve addressed their entire list of band-aids, but still not substantially improved their overall experience.

#3: Interpret findings with caution

The typical user research journey follows the path below:

There is no guarantee that a user study will generate useful findings and insights, because many things can go wrong if the study isn’t done properly. However, this can usually be mitigated with proper planning and technique. Yet even when we obtain useful findings, we still have to translate those insights into a good design.

With limited data, and different upgrade behaviors from organizations of various sizes, how do we interpret the findings? This is where a UX professional can truly deliver value to their product team. In our case, we believed that both general directions had business value, and we needed to ensure our design solution didn’t help one while hindering the other. One can’t simply ignore logical use cases just because they didn’t appear in a sample of customer interviews.

In this case, I sorted the different types of behaviors into two categories. Customers who upgrade at the host level were considered primary users. While no participant in the interviews said they deploy upgrades at the component level, I didn’t rule this possibility out entirely; users who upgrade at the component level were considered a secondary use case. The final design prioritized the host view, but the view we delivered effectively addressed both sets of needs.

How did the design change?

Workflow before
Workflow after

The redesign specifically streamlined the scenario in which multiple component updates are available, such as BIOS, BMC, and HBA. The user can now select all of them on a host and, with one click, proceed to upgrade in the redesigned workflow.

In addition to understanding the user upgrade process, we also learned that auto-update of firmware was the main feature request from these users. This came as a surprise to the entire team, and may be a story for another time.

Results

LCM 2.2 user feedback has been very positive. Some specific comments on the host-level view included:

  • I wanted to say thank you for the new LCM 2.2 — I like the new look and feel — with a small button I can even see the versions of NIC drivers and versions/quantity of disks easily!
  • Great job with LCM 2.2! I wanted to provide some positive feedback from Exelon’s Dave Starcher… He’s since upgraded to 2.2 and has been very complimentary on the reliability while running inventory. He also complimented the other feature enhancements, specifically the ability to upgrade a single host at once. The actual quote was “I feel they took all the things I said I wanted and put it in”.
  • Like this compact view a lot. I don’t like information hiding in multiple layers and I have to (manually) find them.

Through the effective use of foundational UX research, we were able to validate assumptions and better understand our users’ true needs. The holistic findings and insights gained from the foundational research were unlikely to have been uncovered by just reviewing customer support calls or lists of Jira defects. Only by stepping back from the specific line items, talking directly to customers about their broader contexts, digging deeper than their direct complaints, and persistently asking “why?” were we able to get to root causes and define the fundamental UX paradigm changes necessary to substantially increase user satisfaction.
