8 Learnings That Improved Quality in Our React Design System

The design systems team at PriceRunner shares what they have learned from improving the quality and stability of their React design system, Fabric.

Andreas Oldeskog
PriceRunner Tech & Data
9 min read · Apr 15, 2021


Developing a design system for a large website is hard. You want to maintain a steady stream of improvements and new features, fix bugs and battle technical debt, while at the same time avoiding introducing new issues. A common problem for most technical systems, really.

We have worked on our design system for a few years, and over the last few months we have focused on adding a number of quality checks that help us keep our React design system a stable product.

Let me first briefly present Fabric. Fabric is our design system at PriceRunner, which we have worked on since 2019. Like at many companies, it started from a need to increase design consistency, component reusability and developer productivity. We have since worked hard to include the components we need, and now have over 130 components in the system.

Fabric is our design system at PriceRunner

Quick look at our tech stack:

  • React with TypeScript, Sass and Emotion
  • Storybook is used for documentation
  • Over 130 components
  • A team of four — 1 PO and 3 developers
  • Used by both internal and external apps, like PriceRunner.com

Type checking our code with TypeScript

When we started the initiative to improve our quality assurance process, TypeScript was one of the first things we decided to implement. The React code base was written in JavaScript with React prop-types. While prop-types do improve the developer experience somewhat, they are easily missed and do not give compile-time warnings.

We were able to quickly reach over 90 % type coverage for our 60 000 LOC code base using the automated migration tool ts-migrate, and have since continued to push it higher.

With TypeScript in place we noticed that:

  • We had missed proper checks for nullish values
  • We passed props to functions without checking their types (e.g. a React.ReactNode passed to a function expecting a string)
  • We were not consistent in our props API in terms of naming and types (with TS we can reuse shared types to enforce this; see the sketch after this list)
  • We have increased productivity thanks to code completion, type hints and auto imports
  • We have higher confidence when refactoring our code base
  • Our consumers can learn a component more quickly thanks to better documentation in Storybook and in the editor
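
As an illustration of the point about reusing types, a shared base interface keeps prop names and value sets consistent across components. A minimal sketch (the components and props below are made up):

```tsx
import type { ReactNode } from 'react';

// Shared scales reused by every component that exposes these props.
type Size = 'small' | 'medium' | 'large';

interface CommonProps {
  children?: ReactNode;
  className?: string;
  size?: Size;
}

// Button and Card extend the same base, so `size` cannot drift into
// different names or value sets between components.
interface ButtonProps extends CommonProps {
  onClick?: () => void;
  disabled?: boolean;
}

interface CardProps extends CommonProps {
  elevated?: boolean;
}
```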

Integration testing with React Testing Library

In our effort to build the groundwork of the design system, writing tests was deprioritized. To pay back this technical debt we set a few ground rules for our test strategy with React Testing Library (a condensed test sketch follows the list):

  • Verify that all content props are part of the render output
  • Render the component with nullish prop values and make sure it still renders as expected
  • When needed, trigger events using fireEvent and verify expected callbacks are triggered
  • Fail tests when React prop-types warnings are thrown (not all of our consumers use TypeScript, so we need prop-types)
  • Run automated accessibility tests using jest-axe
  • Verify the component works in a server side rendering context (see example below)
  • We do not use snapshot testing since it couples the test too tightly to the DOM output
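
A condensed sketch of how these rules translate into a test file (the Button component and its props are hypothetical):

```tsx
import React from 'react';
import { render, screen, fireEvent } from '@testing-library/react';
import { axe, toHaveNoViolations } from 'jest-axe';
import '@testing-library/jest-dom';
import { Button } from '../Button'; // hypothetical component

expect.extend(toHaveNoViolations);

describe('Button', () => {
  it('renders its content props', () => {
    render(<Button label="Add to cart" />);
    expect(screen.getByText('Add to cart')).toBeInTheDocument();
  });

  it('renders with nullish props without crashing', () => {
    render(<Button label={undefined} />);
    expect(screen.getByRole('button')).toBeInTheDocument();
  });

  it('triggers the onClick callback', () => {
    const onClick = jest.fn();
    render(<Button label="Add to cart" onClick={onClick} />);
    fireEvent.click(screen.getByRole('button'));
    expect(onClick).toHaveBeenCalledTimes(1);
  });

  it('has no obvious accessibility violations', async () => {
    const { container } = render(<Button label="Add to cart" />);
    expect(await axe(container)).toHaveNoViolations();
  });
});
```
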
How to test components in SSR context, in a Jest test (Gist)
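
The idea behind that test is sketched below: render with ReactDOMServer in a Node Jest environment so that any accidental access to window or document fails loudly (the Button import is hypothetical):

```tsx
/**
 * @jest-environment node
 */
import React from 'react';
import { renderToString } from 'react-dom/server';
import { Button } from '../Button'; // hypothetical component

describe('Button (SSR)', () => {
  it('renders on the server without throwing', () => {
    expect(() => renderToString(<Button label="Add to cart" />)).not.toThrow();
  });

  it('includes its content in the server-rendered markup', () => {
    const html = renderToString(<Button label="Add to cart" />);
    expect(html).toContain('Add to cart');
  });
});
```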

Automated tests are a relatively fast way of making sure your product works, but they can also be frustratingly slow when you are waiting for the CI pipeline to finish. To highlight this we visualize test run times in a Grafana dashboard, which makes slow-running tests easy to detect.

Visual regression testing

RTL tests are great at making sure components work functionally, but they do not catch when a padding has unexpectedly disappeared in an edge case. This is where visual regression testing shines.

When we researched services for visual testing we wanted something that could run on every commit in a pull request and give us a report with all visual changes. Our requirements:

  • Detect visual changes and regression for every commit in a pull request
  • Snapshot multiple view ports (mobile, desktop)
  • Let us manage review and approvals within the Gitlab interface
  • Should not trigger too many false positives
  • Should not be too expensive, or free if possible

We found a few great alternatives: Percy, Chromatic, Loki and Reg-suit. Percy and Chromatic cost $99/month and $149/month respectively for 35 000 snapshots a month. Since we have over 600 stories in Storybook and want both mobile and desktop snapshots on every commit push, paid services quickly get expensive.

Reg-suit is an open source product that offers a complete solution for comparison, storage and report generation. The author has also created Storycap, a library for capturing snapshots from a Storybook instance. This product fit our requirements very well, at a software cost of $0/month.

Visual testing in a pull request flow with Reg-suit and Storycap

We have configured Reg-suit to run in both pull request pipelines and the main branch pipeline. Pull request pipelines post a report comment with all detected changes, while the main branch pipeline stores the snapshots as the new expected snapshot set.

Reg-suit posts a comment with visual changes in the GitLab Merge Request

The handy thing about getting a comment in the GitLab MR is that we do not need to leave the interface to see the changes. When we want to dig deeper we can open the complete report and use Reg-suit’s diffing tools.

Reg-suit report with tools: “diff”, “slide”, “2up”, “blend”, “toggle”

All in all we are very happy with the visual testing tool. It has saved us a lot of time in the review process since we can directly see where components have changed. We have been spared most of the weird false positives that can happen with visual testing (e.g. anti-aliasing pixels), but we did have to turn off all CSS animations since they triggered changes on every run.
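
One way to turn animations off for snapshots is a global Storybook decorator that injects a style disabling animations and transitions; a rough sketch:

```tsx
// .storybook/preview.tsx (sketch): disable CSS animations and transitions
// globally so visual snapshots stay deterministic between runs.
import React from 'react';

const disableAnimations = `
  *, *::before, *::after {
    animation: none !important;
    transition: none !important;
    caret-color: transparent !important;
  }
`;

export const decorators = [
  (Story: React.ComponentType) => (
    <>
      <style>{disableAnimations}</style>
      <Story />
    </>
  ),
];
```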

Continuous performance testing

In our Storybook instance we have added the excellent storybook-addon-performance addon from Atlassian. It runs a set of performance tests against your stories. For example:

  • “Render to string” time and size
  • Initial render, re-render, hydrate time
  • DOM element count
The storybook-addon-performance tool from Atlassian
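
Enabling the addon is a small configuration change; roughly, following the addon's documented setup for Storybook 6 (story paths are examples):

```ts
// .storybook/main.js — register the addon
module.exports = {
  stories: ['../src/**/*.stories.tsx'],
  addons: ['storybook-addon-performance/register'],
};
```

```tsx
// .storybook/preview.tsx — apply the performance decorator to all stories
import { withPerformance } from 'storybook-addon-performance';

export const decorators = [withPerformance];
```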

The addon is a perfect tool when you are working locally to fix a performance issue in a specific component, but it is very hard to use it to answer questions like “Have we made this component slower in the last month?” or “What are our slowest components?”

We figured that it would be possible to continuously run these tests and store the results in a Prometheus instance. That way we can display graphs in a Grafana dashboard and answer the questions above.

A new CLI was born to solve this. It fetches all available stories from the Storybook instance and loops through them to run the performance tests. When finished, the stats can be pushed to a Prometheus Pushgateway, or rendered as a table/JSON/CSV on stdout.
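
The CLI itself is internal, but the push step could look roughly like the following prom-client sketch (metric names, labels and the Pushgateway URL are made up):

```ts
import client from 'prom-client';

// Stats collected per story by the performance runner.
interface StoryStats {
  storyId: string;
  initialRenderMs: number;
  domElementCount: number;
}

const registry = new client.Registry();

const renderTime = new client.Gauge({
  name: 'fabric_story_initial_render_ms',
  help: 'Initial render time per story in milliseconds',
  labelNames: ['story'],
  registers: [registry],
});

const domElements = new client.Gauge({
  name: 'fabric_story_dom_element_count',
  help: 'Number of DOM elements rendered per story',
  labelNames: ['story'],
  registers: [registry],
});

export async function pushStats(stats: StoryStats[]): Promise<void> {
  for (const { storyId, initialRenderMs, domElementCount } of stats) {
    renderTime.set({ story: storyId }, initialRenderMs);
    domElements.set({ story: storyId }, domElementCount);
  }
  // Example Pushgateway URL; recent prom-client versions return a promise here.
  const gateway = new client.Pushgateway('http://pushgateway:9091', {}, registry);
  await gateway.pushAdd({ jobName: 'fabric-performance' });
}
```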

Our internal CLI tool that will traverse all stories and collect performance metrics

When this tool was in place we created a new GitLab pipeline that runs every 2 hours and pushes the data to our Prometheus instance. Eventually we had enough data to start experimenting with a new Grafana dashboard.

There are two sections in the dashboard. The first gives a detailed look at a specific component, while the second is a summary view of all components. In other words, we can identify both when a specific component became slower and which components perform the worst.

A selection of performance graphs we have in our performance Grafana dashboard

Bundle size metrics

Besides managing runtime performance, we want to make sure our dependencies don't grow too large and that the CSS compiled from Sass isn't unexpectedly big. Just like for the performance metrics above, there are lots of great tools for generating HTML reports from a Webpack stats JSON file. We found a tool called bundle-stats that can output both an HTML report and an aggregated JSON file. The JSON file can then be parsed and sent to a Prometheus Pushgateway, so that it, once again, can be presented in Grafana.

We don't have a dashboard for this ready yet, but our idea is to present:

  • Summary bundle size for CSS and JavaScript
  • For each component, its CSS and JavaScript size, both with and without dependencies
  • All dependencies and their CSS/JavaScript sizes

The advantage of storing this over time, instead of doing a one-off report, is that it becomes much easier to detect when the bundle size increases and which components or dependencies caused it.
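
As a rough sketch of the summary numbers, Webpack's stats JSON lists every emitted asset with its byte size, so per-type totals can be computed like this (the stats file path is an example):

```ts
// Summarise JavaScript and CSS sizes from a Webpack stats file.
import { readFileSync } from 'fs';

interface WebpackAsset {
  name: string;
  size: number;
}

const stats = JSON.parse(readFileSync('webpack-stats.json', 'utf8')) as {
  assets: WebpackAsset[];
};

const totalBytes = (extension: string) =>
  stats.assets
    .filter((asset) => asset.name.endsWith(extension))
    .reduce((sum, asset) => sum + asset.size, 0);

console.log('JS total (kB):', (totalBytes('.js') / 1024).toFixed(1));
console.log('CSS total (kB):', (totalBytes('.css') / 1024).toFixed(1));
// These numbers could then be pushed to the same Prometheus Pushgateway
// as the performance metrics and graphed in Grafana.
```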

Example of bundle report with Bundle-Stats

Understand how components are being used

When creating or refactoring components we want to base our decisions on data instead of guesses. If there were just one consumer of the design system, that would be as easy as searching in that codebase, but with over 20 different projects it quickly gets tedious.

To solve this we decided to use Sourcegraph. It indexes all of your repositories and lets you search across them at once. That way you can make a query to find all instances of “<ProductCard” and learn how the component is being used, what the most common prop values are, and so on.

Sourcegraph also offers a GraphQL API for doing searches programmatically. We used the API to create a UI in Storybook that finds all the places a specific component is used.
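
A rough sketch of such a programmatic search (the instance URL and token are placeholders, and the exact GraphQL fields depend on the Sourcegraph version):

```ts
// Query the Sourcegraph GraphQL API for usages of a component.
const SOURCEGRAPH_URL = 'https://sourcegraph.example.com/.api/graphql'; // placeholder

const query = `
  query FindUsages($searchQuery: String!) {
    search(query: $searchQuery) {
      results {
        results {
          ... on FileMatch {
            repository { name }
            file { path }
          }
        }
      }
    }
  }
`;

export async function findComponentUsages(component: string) {
  const response = await fetch(SOURCEGRAPH_URL, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `token ${process.env.SOURCEGRAPH_TOKEN}`,
    },
    body: JSON.stringify({
      query,
      // Search for JSX usages such as "<ProductCard" across all indexed repos.
      variables: { searchQuery: `<${component} lang:typescript count:all` },
    }),
  });
  const { data } = await response.json();
  return data.search.results.results;
}

// e.g. findComponentUsages('ProductCard').then(console.log);
```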

Custom UI within Storybook, built on top of Sourcegraph API

The UI lists all files referencing the Stack component, and you can filter further by its props. An inline view of the file is shown when clicking a row.

A next version of this tool would display an aggregated list of the most used props and their respective values. That would help settle decisions when refactoring component APIs.

Review process

All of our work is done in feature branches and pushed to a pull request. As per CI practice, the feature branch is built and tested. We also deploy a preview version of Storybook so that the reviewer can test the component in isolation. For a more integrated test we deploy a preview of the public website (pricerunner.com) with our Fabric changes.

After the reviewer has verified the functionality in Storybook and on the website, she can continue to review the code changes and inspect the visual testing report. Finally the change is ready for approval and merge to the main branch. Our semantic-release setup then takes over to version the changes correctly before publishing to the NPM registry.

Quality score and health metrics

To get a better grip on which components need further improvements we regularly perform an inventory of our components. In this inventory we score how well each component matches our quality checks. For example:

  • Do we handle data states such as error state, empty state and loading state?
  • Do we support all needed design states, like theme variants, colors and responsive states?
  • Do we have visual states for user interactions such as touch, hover, focus?
  • Do we have proper documentation and examples in Storybook as well as JSDoc comments in code?
  • Do we follow accessibility standards and support necessary keyboard navigation?

Our team also has a set of product health metrics. The health metrics define a baseline for what we consider a healthy Fabric. When something falls outside our goals we need to prioritize fixing it.

  • Our components' runtime performance is within our goal
  • Each individual component has a quality score equal to or above our goal
  • The overall component quality score for Fabric is equal to or above our goal
  • Test coverage is within our goal
  • Fabric includes >90% of the most popular components found in open source design systems

Summary

All of these processes have helped us greatly in making Fabric a better product. It is not bulletproof; we are still human in the end, and we are continually looking for ways to improve the process.

PriceRunner is a service for comparing product prices, reading user reviews, expert product reviews and online shopping guides. We operate in the United Kingdom, Sweden and Denmark. Browse products at https://www.pricerunner.com.
