Using research techniques to improve your Design System

How to quantify design system improvements by creating reliable baselines

Max Dunne
Pion
9 min read · Jun 19, 2023


[Image: an organised artboard in Figma]

A good Design System builds constraints to harness creative chaos. It tells a story that others can understand and welcomes newcomers aboard with open arms, a toothy grin and a bottle of Bolly. It adapts and changes with the business and knows that it, the facilitator of prototypes, is itself forever a prototype. There’s no silver bullet and, like a true workhorse, the good design system knows that and doesn’t creak under pressure.

The efficient system creates an environment that fosters brand expression and teaches the design language broadly and in plain English. It’s a place where the actual work gets done and it provides technical structure, context and complexity to those who seek it (developers). It’s the keystone supporting the bridge between the designer’s style guides and the developer’s UI libraries.

A good Design System

A poorly organised system creates a misalignment earthquake, the ripples of which are felt across the entire organisation. An ad-hoc, free-for-all approach to naming conventions can release a bad smell when you're selling a prototype to a prospective client, when you're fumbling around trying to find designs in a mega-file that won't load properly, or when someone is trying to find designs without a designer. Files might not sit in an agreed-upon Team or Project structure, which makes scaling a nightmare. New joiners will recoil at a messy system and, as Phil Dunphy once uttered, you only get one chance at a first impression. A messy system is the enemy of speed and innovation.


At Student Beans our Design System is used (and at the very least seen) by Product Design, Brand & Marketing Design, Developers, Product Managers, the Social Impact team, Product Marketing, Account Managers, the People team and many more. It’s utilised from Great Britain to the United States, from the Philippines to Australia. Our clients and students see our prototypes and files when we’re co-creating or conducting research. Our Design System is global, bigger than just the designers, and so this year we set out to improve it.

In my 16 years in the Design game, I've never seen a Design System I've thought of as 'The One'. How a system is set up is always contextual to the business. The same is true at Student Beans.

After joining the business in October 2022, I set my focus on understanding how the designers work together and with the rest of the business. I found it difficult to navigate the 18-year-old system (Student Beans turns 18 on the 21st June) on my own, and frequently requested help. I would get frustrated and move on to something else. I searched Figma for documentation and found some relevant artefacts, but nothing that matched exactly how the system was being used.

This affected, albeit from a survey of 1, most of the points mentioned in the introduction: onboarding, understanding the story (the way we work, or the lack of a Design Language) and harnessing creative chaos. I don't think the system did any of these things particularly well and, because of its vastness, I couldn't tell what was new or old, in play or ready for delivery. My mind wandered to how the team handled sick days, general exploration for inspiration, holiday handovers and more.

My hunch was that if I found it frustrating and difficult to navigate on first impression, then other people probably would too. This gave us the basis for a first hypothesis: if we improve the overall architecture of the system, along with some business-wide training, we'll make the system more accessible and so improve confidence in our processes and much more. We would be moving the needle in the right direction. The question was, where is the needle and what is it pointing at? This is where baseline metrics come in.

How we built the baselines

Quantifying Design System improvement is difficult, and without a baseline metric you'll never know, apart from anecdotally, how well your changes are doing. You can get stuck in the minutiae and locality of the design team itself: design cycle times, components used and so on. Whilst that's also important, it's too narrow a view given the impact a shoddy system can have on the business. We needed Big Picture metrics.

We needed a baseline for the entire system, business-wide, and a baseline for what we deemed the most valuable problem area: Discoverability.

The System Usability Scale

For our business-wide metric we used the System Usability Scale (SUS). This is an efficient way to gather quick feedback from many people around the business, and it requires no working knowledge of Figma.

[Image: the SUS survey]

With that baseline — which was less-than-flattering — we had a metric to measure against using the same survey, once we'd made whatever changes were necessary.
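For anyone who hasn't run one before, SUS scoring itself is standard arithmetic: ten statements rated 1 to 5, odd-numbered (positively worded) statements contribute their rating minus one, even-numbered (negatively worded) statements contribute five minus their rating, and the total is multiplied by 2.5 to give a 0–100 score. Here's a minimal Python sketch, using made-up responses rather than our real data:

```python
# Minimal sketch of standard SUS scoring. The responses below are invented.

def sus_score(ratings):
    """Turn one respondent's ten 1-5 ratings into a 0-100 SUS score."""
    assert len(ratings) == 10 and all(1 <= r <= 5 for r in ratings)
    contributions = [
        (r - 1) if i % 2 == 0 else (5 - r)  # index 0 is statement 1 (odd, positive)
        for i, r in enumerate(ratings)
    ]
    return sum(contributions) * 2.5

# Hypothetical respondents; the baseline is simply the average score.
responses = [
    [3, 4, 2, 4, 3, 3, 2, 4, 3, 3],
    [2, 5, 2, 4, 2, 4, 3, 3, 2, 4],
]
baseline = sum(sus_score(r) for r in responses) / len(responses)
print(f"SUS baseline: {baseline:.1f}")  # roughly 68 is considered average
```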

A small problem with sending out an SUS survey about a system that is under-utilised, misunderstood or causes apprehension is that it will get ignored or pushed to the back of to-do lists. As ever, you need to be prepared to work hard for those entries and be grateful to those who provide you their thoughts. Another thing to remember is that it doesn't matter if the baseline is already really good or if it's terrible. What matters is that you have the baseline metric and you're focussed on using design thinking to improve it. Your organisation is endorsing you to spend time on something that's quite hard to quantify; this says something about how your business views and values design and goes some way to increasing your UX maturity.

To Moderate or Not to Moderate?

There were two trains of thought for setting the Discoverability baseline. The first was unmoderated tests. Perhaps we could save time by designing a test which our colleagues could do in their own time. Given the size and complexity of the system and the ambivalence towards it (shown in the SUS score), we knew this would be inefficient and too difficult to pull off properly. So we decided on moderated tests with a facilitator, clear scene-setting tasks and metrics to set focus. A 1–2–1 video call is harder to ignore than a Slack message, too.

Now that we knew the format, we needed to decide who we were going to speak to. Of all the departments listed earlier, there were three key groups that needed to be prioritised: the Product Managers, the Designers and the Developers, as they use Figma every day. Improving the system for these groups means more autonomy, more speed and a smoother delivery process.

We then wrote the tasks. We wanted to focus on Discoverability and so came up with three high-level scenarios which tested the design language and file structure used in the system:

  • Task 1 — The Easter bunny has hidden an egg in our Design System. You need to find the new Easter Egg which has been planted in the most up-to-date version of the Saved tab on iOS, because you might want to use it in your work. Stop when you find it.
  • Task 2 — You want to find the full, most up-to-date flows of the Search on Web. Don’t stop until you find it.
  • Task 3 — You’ve heard about some interesting work being done involving X. You’re curious, and you want to have a little look at the initial ideas because you’d like to know more about it. Stop when you find the X page.

Now for the measurements (there's a rough capture sketch after this list). Our focus here was on:

  • Number of Hesitations — qualitative and quantifiable. Hesitations cause doubt and too much doubt causes abandonment and affects morale, energy and confidence.
  • Number of Wrong Turns — qualitative and quantifiable. Too many wrong turns can lead you down the garden path. Without direction, one wrong turn could lead to abandonment and a bottleneck.
  • Task Completed — a key metric. In theory the higher the number of hesitations and wrong turns the greater the likelihood of an incomplete task.
  • Time-to-Task — the money maker. Quantifiable. You can calculate actual cost to the business with this metric. You can then put a cost on the efficiency (or lack thereof) of your design system.
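We captured these with pen and paper on the day (more on that below), but if it helps to picture what a single observation looks like, here's a rough sketch in Python; the field names are ours for illustration, not a prescribed format:

```python
from dataclasses import dataclass

@dataclass
class TaskObservation:
    """One participant attempting one task. Field names are illustrative."""
    participant: str          # e.g. "PM 1"
    task: str                 # e.g. "Task 1 - find the Easter egg"
    hesitations: int          # tally of noticeable pauses or doubt
    wrong_turns: int          # tally of dead ends into the wrong file or page
    completed: bool           # did they find what they were looking for?
    time_to_task_mins: float  # capped at 5 minutes in our case

example = TaskObservation("PM 1", "Task 1", hesitations=4, wrong_turns=2,
                          completed=False, time_to_task_mins=5.0)
```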

So, due to time constraints and knowing this research could grow into a beast of its own, we identified 3 Product Managers, 3 Designers and 3 Developers, and we asked them:

On average, how many times a day do you think you:

  • a) Ask a designer for a design, flow, or element of a design or
  • b) Want to look for or need something in Figma

This number would be the multiplier for our Time-to-Task metric, which we conservatively capped at 5 minutes. In reality, time searching a system could fluctuate.

For example: PM 1 requests something in Figma 3 times a day. If PM 1's time-to-task score is 5 minutes, then PM 1 on average spends 15 minutes per day needing something from Figma, and so 75 minutes per week spent lost in the wilderness.
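Expressed as a tiny calculation, using PM 1's numbers from the example above:

```python
# PM 1's numbers from the worked example above.
requests_per_day = 3       # answer to "how often do you need something in Figma?"
time_to_task_mins = 5      # conservatively capped at 5 minutes per search
working_days_per_week = 5

mins_per_day = requests_per_day * time_to_task_mins       # 15 minutes
mins_per_week = mins_per_day * working_days_per_week      # 75 minutes
print(f"{mins_per_day} mins per day, {mins_per_week} mins per week")
```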

Running the Sessions

Next we ran the sessions and noted down thoughts spoken out loud. We found the system made people feel ‘silly’ and anxious because they thought they knew it well, given the amount of time they spend using it.

We timed the interactions on my iPhone; if it ain’t broke, don’t fix it. No need to go with fancy tools.

We noted wrong turns and hesitations with tally marks on paper, which were counted later. This was done so we could give our colleagues our full concentration.

Research results

After all the sessions were complete, we multiplied each Time-to-Task result by the corresponding number of daily Figma requests, created a monthly figure and averaged the numbers out across the three people within each discipline.

e.g. PM 1: 3 requests per day × 5 mins time-to-task = 15 mins per day; × 5 working days = 75 mins per week; × 4 weeks = 300 mins per month.

We then extrapolated those figures (time spent per month, in minutes) as an assumption across all members within each discipline where Figma use would be relevant, and converted the minutes into hours, as you can see in the right-hand column below. In our case that was 8 PMs, 5 Designers and 20 Developers.
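Laid out as a sketch, the whole calculation looks something like this. The per-participant request counts below are made up for illustration (they're not our raw session data), though they're chosen so the output lands near the PM figure in the summary that follows:

```python
# Sketch of the extrapolation: per-person monthly minutes, averaged across the
# three participants in a discipline, then scaled to the discipline's headcount.
WORKING_DAYS_PER_WEEK = 5
WEEKS_PER_MONTH = 4

def monthly_minutes(requests_per_day, time_to_task_mins=5):
    """Minutes per month one person spends looking for something in Figma."""
    return requests_per_day * time_to_task_mins * WORKING_DAYS_PER_WEEK * WEEKS_PER_MONTH

# Three PM participants with hypothetical daily request counts.
pm_monthly = [monthly_minutes(r) for r in [3, 4, 3]]    # [300, 400, 300]
pm_average = sum(pm_monthly) / len(pm_monthly)           # ~333 mins per PM per month

PM_HEADCOUNT = 8
pm_discipline_mins = pm_average * PM_HEADCOUNT
print(f"{pm_discipline_mins:.0f} mins (~{pm_discipline_mins / 60:.0f} hours) "
      f"per month across {PM_HEADCOUNT} PMs")
```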

Final data: 2,667 minutes (44 hours) per month spent in Figma across 8 PMs; 160 minutes (3 hours) across 5 Designers; 3,747 minutes (62 hours) across 20 Developers.

The Final Efficiency Average

Once we had the extrapolated averages across all three tasks for all three groups, I then averaged those numbers together one more time to get the final efficiency scores per month, in hours.

Final average spent searching Figma

For example, across 8 Product Managers, 37 hours per month are spent searching Figma for something, with 60% of those searches — worked out from the 'Task Completed' metric — leaving them empty-handed and potentially frustrated. For one Product Manager, that's 4.6 hours per month spent lost in Figma, or 55.5 hours yearly, which is the equivalent of 7 working days per year, per Product Manager. Across the group, that's 56 days per year, almost a full quarter, spent lost in Figma.

We can now attach a cost to the efficiency of our Design System across key users in the organisation. We have strong baselines against which to measure OKRs: increase the SUS score to X, reduce time-to-task by X, reduce the number of wrong turns by X, and so on.
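To turn those hours into money, all you need is a blended hourly rate. The rate below is entirely made up; substitute your own salary data:

```python
# Attaching a notional cost to the time lost. The hourly rate is hypothetical.
pm_hours_per_month = 37           # final efficiency average for Product Managers
blended_hourly_rate_gbp = 40      # made-up figure, not a real Student Beans rate

monthly_cost = pm_hours_per_month * blended_hourly_rate_gbp
yearly_cost = monthly_cost * 12
print(f"£{monthly_cost:,} per month, £{yearly_cost:,} per year spent searching Figma")
```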

It’s worth noting that this is an experimental way of quantifying design system progress. As ever, we’re not 100% sure if it’ll work but that’s the fun of it. It’s a prototype and like any prototype it exists to be tested, broken, fixed and improved as time goes on.

https://partner.studentbeans.com/careers/current-openings/
