Building a design system without knowing how it is used can be very tricky, and even the simplest of questions can be hard to answer. Do you know how often your components are used? Can you remove deprecated components from code? What is the adoption of your design system?
We wanted to know the answers to these questions, and that is why we started an initiative to measure our design system.
Looking for the right metrics
Generally, you can measure a design system in two ways: with soft metrics and with hard metrics. You can gather soft metrics through surveys such as SUS or NPS, which tell you how satisfied your users are. But since we wanted to know the adoption of our design system across our codebase, we needed to go with hard metrics.
We spent a lot of time determining what specific hard metrics we wanted to track and what questions about our design system we wanted to answer.
It was difficult for us to define what a fully adopted design system means. Does it mean that there will be no other UI components defined outside of the design system? How do you compare it to other non-design system components?
To answer these questions, we experimented a little with visual measurement, as we thought it might answer our main question about adoption. The idea was to color every component coming from the design system in one color, allowing us to see the design system coverage on a single screen. This gave us a very interesting insight into the state of our application. However, we soon realized that visual measurement is hard to quantify as a metric.
What we realized after the fact was that we hardly ever had a screen that used only components from the design system. There will always be components that are useful only for specific parts of an application and that are maintained by the respective product team. This means that every screen would need a manually set threshold for percentage coverage.
All of this makes visual measurement hard to quantify, but it is a good way to find areas for improvement, as people can very easily identify places that could be covered by the design system from color-coded screenshots.
In the end, we used visual measurement as a secondary metric and settled on primary metrics that we can measure exactly and compare against each other:
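A minimal sketch of the per-screen threshold idea, assuming coverage is counted as the share of rendered components that come from the design system (it could equally be measured by visual area). The `RenderedComponent` shape and the `screenCoverage` helper are hypothetical names used only for illustration.

```typescript
// Hypothetical shape: one entry per component rendered on a screen.
interface RenderedComponent {
  name: string;
  fromDesignSystem: boolean;
}

interface ScreenCoverage {
  coverage: number; // share of design system components, between 0 and 1
  meetsThreshold: boolean;
}

// Computes the share of design system components on one screen and
// compares it with a manually chosen, per-screen threshold.
function screenCoverage(
  components: RenderedComponent[],
  threshold: number
): ScreenCoverage {
  if (components.length === 0) {
    return { coverage: 0, meetsThreshold: threshold <= 0 };
  }
  const fromSystem = components.filter((c) => c.fromDesignSystem).length;
  const coverage = fromSystem / components.length;
  return { coverage, meetsThreshold: coverage >= threshold };
}
```

Because the threshold is an argument, a screen dominated by team-specific components can be given a lower target than a screen built mostly from shared building blocks.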
Tracking instances of components over time:
- Tracking instances of deprecated components
- Tracking usage of props on React components
Tracking progress of migration:
- Migration of components
- Migration of custom typography
In addition to this, we wanted the ability to scope everything to a project or specific team, so we could see how each project is adopting our design system.
Based on the metrics defined, we started crafting a technical solution. From the requirements, we knew that we would need to know the number of instances of our components and also the number of custom typography definitions. This led us to two solutions. Let’s dig a little deeper into both of them.
Measuring instances of components
You might think that counting the instances of components could be accomplished with straightforward static code analysis. And you wouldn’t be far off. We considered writing our own AST parser, but in the end, we found the perfect open-source project: react-scanner. As our codebase is fully in React and TypeScript, react-scanner matched our needs perfectly.
What is great about this library is that you can access the scanner’s raw data and transform it slightly to fit your needs, which is what we do. In our case, this mostly meant adding IDs to reports and components so that every component is uniquely identified in our data analytics tool, but more on that later.
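As a rough illustration of that kind of transformation, the sketch below flattens a raw report into rows with a stable ID. The `RawReport` type is an assumption modeled on the output of react-scanner's `raw-report` processor and only covers the fields used here. The ID scheme, the `projectOf` path convention, and the helper names are hypothetical, and file paths are assumed to be relative to the repository root.

```typescript
// Assumed subset of a raw report: component name -> list of instances.
interface RawInstance {
  props: Record<string, unknown>;
  location: { file: string; start: { line: number; column: number } };
}
type RawReport = Record<string, { instances: RawInstance[] }>;

interface InstanceRow {
  id: string; // unique per component instance
  component: string;
  project: string;
  file: string;
  props: string[]; // names of the props used on this instance
}

// Hypothetical convention: a project is the first two path segments,
// e.g. "apps/billing/src/Invoice.tsx" belongs to "apps/billing".
function projectOf(file: string): string {
  return file.split("/").slice(0, 2).join("/");
}

// Flattens the report into one row per instance and derives an ID
// from the component name and its source location.
function toRows(report: RawReport): InstanceRow[] {
  const rows: InstanceRow[] = [];
  for (const component of Object.keys(report)) {
    for (const instance of report[component].instances) {
      const { file, start } = instance.location;
      rows.push({
        id: `${component}:${file}:${start.line}:${start.column}`,
        component,
        project: projectOf(file),
        file,
        props: Object.keys(instance.props).sort(),
      });
    }
  }
  return rows;
}

// Counts rows by an arbitrary key, e.g. per component or per project.
function countBy(
  rows: InstanceRow[],
  key: (row: InstanceRow) => string
): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const row of rows) {
    const k = key(row);
    counts[k] = (counts[k] || 0) + 1;
  }
  return counts;
}
```

Calling `countBy(rows, (row) => row.component)` gives instances per component, while `countBy(rows, (row) => row.project)` scopes the same data to projects.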
Measuring custom typography definition
We wanted our design system to have complete control over typography, so we could easily change its style across our application without much hassle.
We knew how many instances of the Text component we had, but we also needed to know how much custom typography there was, so we could compare those two numbers.
To answer this, we wrote custom ESLint and stylelint rules that give warnings on definitions of font-size, font-weight, and line-height where the value is not a token coming from the design system. A warning message then advises using components from the design system instead. This has two advantages: first, it warns about forming new technical debt, and second, it gives us a way to actually track it.
I won’t go into much detail about our front-end tooling, as we already have a nice article written about that. Just know that we have a monorepo with more than 200 projects, each of which is defined separately and can run ESLint and stylelint.
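To give an idea of what such a rule can look like, here is a minimal sketch of an ESLint rule for JavaScript style objects. It is not the rule described above. It assumes that design system tokens are referenced through identifiers or member expressions (for example a hypothetical `tokens.fontSize.m`), so any literal value is treated as custom typography. The node and context types are reduced to what the sketch needs; a real rule would use ESLint's own types.

```typescript
// Minimal structural types so the sketch stays self-contained.
interface ExpressionNode {
  type: string;
  name?: string; // set on Identifier nodes
  value?: unknown; // set on Literal nodes
}
interface PropertyNode {
  type: string;
  key: ExpressionNode;
  value: ExpressionNode;
}
interface RuleContext {
  report(descriptor: { node: PropertyNode; message: string }): void;
}

const TYPOGRAPHY_PROPS = ["fontSize", "fontWeight", "lineHeight"];

const noCustomTypography = {
  meta: {
    type: "suggestion",
    docs: { description: "Disallow hard-coded typography values" },
    schema: [],
  },
  create(context: RuleContext) {
    return {
      // Visits object properties such as `{ fontSize: 14 }`.
      Property(node: PropertyNode): void {
        const keyName =
          node.key.type === "Identifier" ? node.key.name : String(node.key.value);
        if (keyName === undefined || TYPOGRAPHY_PROPS.indexOf(keyName) === -1) {
          return;
        }
        // Assumption: tokens are never literals, so a literal is a custom value.
        if (node.value.type === "Literal") {
          context.report({
            node,
            message: `Custom ${keyName} found. Use a typography component from the design system instead.`,
          });
        }
      },
    };
  },
};
```

A stylelint rule would follow the same idea for CSS declarations, checking `font-size`, `font-weight`, and `line-height` values against the allowed tokens.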
We added our ESLint and stylelint rules to the global settings, so now every project in the monorepo has these rules. With this, we were able to track those warnings. For all of our projects, we simply ran ESLint and stylelint and saved their output to a JSON file using the `--output-file` CLI argument.
This gave us a lot of JSON files with warnings, not only for typography but also for every other warning that ESLint or stylelint tracks in our codebase. We reduced those files to a single file with this simple format:
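As an illustration only (the actual format may differ), a reduction step of this kind could look like the sketch below. It assumes the per-project reports come from ESLint's JSON formatter, of which only `filePath`, `messages`, `ruleId`, and `severity` are used. The `WarningSummary` shape and the `summarizeWarnings` helper are hypothetical.

```typescript
// Subset of the result objects emitted by ESLint's JSON formatter.
interface LintMessage {
  ruleId: string | null;
  severity: number; // 1 = warning, 2 = error
}
interface LintResult {
  filePath: string;
  messages: LintMessage[];
}

// Hypothetical summary format: project -> rule -> number of warnings.
type WarningSummary = Record<string, Record<string, number>>;

// Reduces the per-project reports to warning counts per rule.
function summarizeWarnings(
  reports: Record<string, LintResult[]>
): WarningSummary {
  const summary: WarningSummary = {};
  for (const project of Object.keys(reports)) {
    const perRule: Record<string, number> = {};
    for (const result of reports[project]) {
      for (const message of result.messages) {
        const ruleId = message.ruleId;
        // Only warnings that belong to a rule are counted here.
        if (message.severity !== 1 || ruleId === null) continue;
        perRule[ruleId] = (perRule[ruleId] || 0) + 1;
      }
    }
    summary[project] = perRule;
  }
  return summary;
}
```

stylelint's JSON output uses different field names, so it would need its own small mapping before being merged into the same summary.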
And just like that, we now know how much custom typography and how many lint warnings we have in our codebase. We can now simply compare the custom font definitions with the instances of the Text, Heading, and Link components, giving us a clear overview of custom versus design system controlled typography.
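One way to express that comparison as a single number is the share of typography that goes through design system components. This is a sketch under the assumption that each custom font definition and each component instance counts as one unit; the `typographyAdoption` helper is hypothetical.

```typescript
// Share of typography controlled by the design system, between 0 and 1.
// componentInstances: instances of Text, Heading, and Link combined.
// customDefinitions: warnings raised for custom typography values.
function typographyAdoption(
  componentInstances: number,
  customDefinitions: number
): number {
  const total = componentInstances + customDefinitions;
  // With nothing to measure, there is no custom typography to migrate.
  return total === 0 ? 1 : componentInstances / total;
}
```

For example, 300 component instances against 100 custom definitions gives an adoption of 0.75.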
What else do we get?
At this point, we had a lot of data: we knew the number of instances of our components, and we had gathered all of the ESLint and stylelint warnings. The pain point was that we handled everything manually, from scanning to making reports. As you can imagine, this was not the best use of our time, so we automated the process. At Productboard, Looker is our data analytics tool of choice, so it was natural for us to push this data into Looker, allowing us to work with the data better and share our results across the company.
To give you a rough idea of how our data pipeline works: a scheduled GitLab pipeline runs a couple of analytics jobs (components, ESLint, and Jest analytics) and stores their output in GitLab artifacts, from where it is pushed into Looker and processed further.
We are now able to inspect all the gathered data in Looker. We can make dashboards that track specific initiatives, such as contributions to the design system and component adoption. All of this can be viewed as a whole or scoped to specific projects in the monorepo or to a specific team. The data is refreshed daily and is accessible to everyone. It’s also possible to create derived dashboards, so the possibilities are endless.
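Daily snapshots are what make tracking over time possible, for example watching the number of deprecated component instances go down. The sketch below compares two snapshots. The `DailySnapshot` shape and the `instanceDelta` helper are hypothetical and do not describe the actual Looker data model.

```typescript
interface DailySnapshot {
  date: string; // ISO date of the pipeline run
  counts: Record<string, number>; // component name -> number of instances
}

// Returns the change in instance count per component between two snapshots.
// Components missing from one snapshot are treated as having zero instances.
function instanceDelta(
  previous: DailySnapshot,
  current: DailySnapshot
): Record<string, number> {
  const delta: Record<string, number> = {};
  const names = Object.keys(previous.counts).concat(Object.keys(current.counts));
  for (const componentName of names) {
    delta[componentName] =
      (current.counts[componentName] || 0) - (previous.counts[componentName] || 0);
  }
  return delta;
}
```

A negative delta for a deprecated component indicates that its migration is progressing, while a positive one indicates that new usages are still being added.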
Since we are able to track the usage of every component, we now know what we should prioritize in maintenance, regardless of whether the components are frequently used or not. All this data is crucial for planning and maintaining our design system, and it would be hard to imagine going back to a time when we didn’t have this deep insight.
I hope you enjoyed this article. If you like it or have any questions, please feel free to comment below👇.