Why Investing in Data Science Infrastructure Enhances Product Development

Maya Shaked
The Patient Experience Studio at Cedar
Jun 2, 2022 · 5 min read

A lot has changed at Cedar since I joined at the start of 2020. We are a multi-product company, have spread to all corners of the US, and can no longer seat most of our New York office at two lunch tables. As a data scientist on the Data Infrastructure & Machine Learning team, I’ve also watched our Data Science organization grow from a small but mighty team of four to a fifteen-person powerhouse covering product analytics, commercial analytics, and machine learning functions. Our data models have grown in complexity as our product has evolved, and our analysis techniques have become more sophisticated. Naturally, as our data scaled, so did the number of pain points we encountered, particularly with product analytics.

Fields were defined slightly differently across analyses; code reviews in Jupyter Notebooks were ineffective for complicated queries; turning on an A/B test relied on engineering resources; documentation grew obsolete as new features rapidly launched. These frustrations revealed a need for more robust internal data science tools and inspired the buildout of our Analytics Library one year ago.

At its core, the library is a toolkit of statistical and analytical functions custom-tailored to our data. After much discussion and external research, we architected the Analytics Library to accomplish three crucial objectives.

Improve analysis reliability and reproducibility

Cedar relies on its data scientists to provide actionable insights that drive product development. For example, we are currently running an experiment to test whether offering more customizable payment plan options will increase plan adoption or whether it will actually create more friction for those considering payment plans. A miscalculated metric could lead to the rollout of a feature that has no positive effect on the patient experience, or worse, a negative one.

To address this issue we moved toward a more rigorous code review process where all product analyses are version controlled and reviewed by at least two other data scientists. We started writing automated tests for our tools and queries that run whenever we ship new code, making significant headway toward our goal of 100% test coverage. This goal stems from the principle that code for analyzing data should be held to the same high standard as the code that runs in production.
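To give a flavor of what these tests look like, here is a minimal, hypothetical sketch using pytest; the collection_rate helper and its behavior are illustrative stand-ins rather than actual Analytics Library code:

```python
import pytest


def collection_rate(amount_paid: float, amount_billed: float) -> float:
    """Illustrative metric helper: the fraction of a bill that was collected."""
    if amount_billed <= 0:
        raise ValueError("amount_billed must be positive")
    return amount_paid / amount_billed


def test_full_payment_yields_rate_of_one():
    assert collection_rate(100.0, 100.0) == 1.0


def test_partial_payment_yields_fractional_rate():
    assert collection_rate(50.0, 200.0) == pytest.approx(0.25)


def test_zero_bill_is_rejected():
    # Guard against silent division-by-zero errors in downstream analyses.
    with pytest.raises(ValueError):
        collection_rate(50.0, 0.0)
```

Tests like these run automatically on every change, so a regression in a shared metric definition surfaces before it can skew an analysis.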

In the past, introducing our analytics processes to new hires was a challenge. This was in large part because code was inconsistently documented. To tackle this issue we implemented Sphinx, a tool that automatically generates documentation from the Analytics Library codebase. Now, new hires can easily navigate our experimentation framework with a neat user interface. The documentation has visibly made an impact on team productivity, as new hires are now taking ownership of A/B tests within their first month of joining Cedar.
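For readers unfamiliar with Sphinx, the setup is roughly: write docstrings in the library code, then point Sphinx’s autodoc extension at the modules. A minimal, hypothetical configuration sketch (project and module names are invented for illustration):

```python
# docs/conf.py (abridged; project and module names are hypothetical)
project = "Analytics Library"
extensions = [
    "sphinx.ext.autodoc",   # pull documentation straight from docstrings
    "sphinx.ext.napoleon",  # parse Google/NumPy-style docstring sections
]

# A docs page then includes a module with an autodoc directive, e.g.:
#   .. automodule:: analytics_library.experimentation
#      :members:
#
# Running `sphinx-build docs docs/_build` renders the browsable HTML reference.
```

Because the documentation is generated from the code itself, it stays current as the library evolves.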

Increase tool flexibility

While there are a number of helpful open source packages we utilize for statistics and data manipulation, there are characteristics unique to our data that require special care. For example, one question we ask ourselves at the end of an A/B test is: should we turn this feature on? Answering this question is not so straightforward if we are optimizing for collection rate, defined as the total amount paid by the patient divided by the amount they were billed.

In a typical A/B test, one can use a binomial proportion test or a chi-squared test to compare a metric between two groups. These tests lean on a normal approximation to the metric’s sampling distribution. However, our collection rate data is far from normal: the majority of bills are under $200, but there is also a non-negligible 2% of patients with bills over $2,500. Payment behavior varies drastically depending on bill size, and we didn’t have a statistically sound way of measuring the impact of a new feature on collection rate.

To address this need we built out a custom suite of bootstrapping and permutation testing tools. Bootstrapping resamples the observed data with replacement to generate the distribution of a parameter (in this case, collection rate), while permutation testing repeatedly shuffles group labels to simulate the null hypothesis. These tests are powerful because they don’t make assumptions about what the underlying data looks like. We used these tools to visualize collection rate confidence intervals and draw conclusions in a recent experiment around SSN verification.
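To make the approach concrete, here is a minimal sketch of both techniques using NumPy; the function names and inputs are illustrative, not the Analytics Library’s actual API:

```python
import numpy as np

rng = np.random.default_rng(seed=42)


def bootstrap_ci(paid, billed, n_resamples=10_000, alpha=0.05):
    """Bootstrap a confidence interval for collection rate (total paid / total billed)."""
    paid, billed = np.asarray(paid), np.asarray(billed)
    n = len(paid)
    rates = np.empty(n_resamples)
    for i in range(n_resamples):
        idx = rng.integers(0, n, size=n)  # resample patients with replacement
        rates[i] = paid[idx].sum() / billed[idx].sum()
    return np.quantile(rates, [alpha / 2, 1 - alpha / 2])


def permutation_test(rates_a, rates_b, n_permutations=10_000):
    """Two-sided permutation test for a difference in mean collection rate."""
    rates_a, rates_b = np.asarray(rates_a), np.asarray(rates_b)
    observed = rates_a.mean() - rates_b.mean()
    pooled = np.concatenate([rates_a, rates_b])
    extreme = 0
    for _ in range(n_permutations):
        shuffled = rng.permutation(pooled)  # reshuffle group labels under the null
        diff = shuffled[: len(rates_a)].mean() - shuffled[len(rates_a):].mean()
        if abs(diff) >= abs(observed):
            extreme += 1
    return extreme / n_permutations  # approximate p-value
```

Neither routine assumes a particular shape for the bill or payment distributions, which is exactly what our heavy-tailed data requires.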

Speed up experiment analysis

We used to dedicate a full week to estimating how long we might need to keep an experiment running and what effect size we might be able to detect. This process is known as power analysis. Defining the experiment’s target population, baseline metrics, and underlying distribution was time intensive, especially because the generic sample size calculators found online are not suited to our patients’ wonky range of bill sizes and payment behavior. We built out functionality to calculate the necessary sample size and minimum detectable effect, tailored to our data. Now we can conduct a thorough power analysis for a proposed experiment in under one day.
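As an illustration of the idea (not the actual implementation), a simulation-based power calculation can resample historical bills, inject a hypothetical lift, and check how often a test would detect it; every name below, including the rank-based test used as a stand-in, is assumed for the sketch:

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(seed=7)


def simulated_power(billed, paid, lift, n_per_arm, n_sims=500, alpha=0.05):
    """Estimate power at a given per-arm sample size by resampling historical data.

    `billed` and `paid` are arrays of historical bill and payment amounts
    (assumed positive); `lift` is the relative collection-rate improvement we
    hope the new feature produces.
    """
    billed, paid = np.asarray(billed), np.asarray(paid)
    rejections = 0
    for _ in range(n_sims):
        idx_c = rng.integers(0, len(billed), size=n_per_arm)  # control arm
        idx_t = rng.integers(0, len(billed), size=n_per_arm)  # treatment arm
        rate_c = paid[idx_c] / billed[idx_c]
        rate_t = np.minimum(paid[idx_t] / billed[idx_t] * (1 + lift), 1.0)
        _, p_value = mannwhitneyu(rate_t, rate_c, alternative="two-sided")
        if p_value < alpha:
            rejections += 1
    return rejections / n_sims


# Sweeping `lift` (or `n_per_arm`) and finding where power crosses ~0.8 gives the
# minimum detectable effect (or required sample size) for a proposed experiment.
```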

We have also historically run into challenges when QAing experiment data post-launch. There was no centralized system for checking whether initial data looked as expected; team members wrote their own one-off scripts and conducted QA slightly differently. This inconsistency inspired an automated process: running a single line of code returns a summary of whether data is randomized appropriately across experiment arms, whether there is any unexpected user behavior, and whether the data is skewed or contains outliers.
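In spirit, that one line wraps a bundle of checks behind a single summary call. Here is a hypothetical sketch of what such a function could look like; the column names and checks are invented for illustration:

```python
import pandas as pd
from scipy.stats import chisquare


def qa_summary(df: pd.DataFrame) -> dict:
    """Summarize post-launch data quality for an experiment.

    Expects columns `arm`, `amount_billed`, and `amount_paid`; the schema and
    checks are illustrative, not the Analytics Library's real interface.
    """
    arm_counts = df["arm"].value_counts()
    # 1. Randomization: are patients split evenly across experiment arms?
    _, split_p = chisquare(arm_counts.values)
    # 2. Unexpected behavior: e.g. payments that exceed the billed amount.
    overpaid_share = (df["amount_paid"] > df["amount_billed"]).mean()
    # 3. Skew and outliers in the primary metric.
    rate = df["amount_paid"] / df["amount_billed"]
    return {
        "arm_counts": arm_counts.to_dict(),
        "randomization_pvalue": float(split_p),
        "share_overpaid": float(overpaid_share),
        "rate_skew": float(rate.skew()),
        "rate_outliers_above_p99": int((rate > rate.quantile(0.99)).sum()),
    }
```

Running `qa_summary(experiment_df)` right after launch gives every data scientist the same checklist, replacing the one-off scripts.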

Who benefits from data science infrastructure?

Over the course of the past year, the Analytics Library has matured from a repository of ad-hoc queries to a suite of flexible tools that increase the speed, quality, and reliability of analyses. Feedback from the team has been overwhelmingly positive. We discuss improvements regularly in team meetings and gather suggestions in comprehensive quarterly surveys.

As a data scientist on the team, of course I get excited about how the Analytics Library has made my work more efficient. But, ultimately, that isn’t what makes it a valuable investment. Robust data science infrastructure dramatically improves product development. Decreasing experiment launch time and standardizing statistical analysis have allowed us to make smarter decisions earlier on about which features will benefit patients and providers alike. Patients receive an enhanced digital experience that gives them control of their finances. The resulting patient loyalty then translates into superior long-term financial results for providers.

Our homegrown analytics toolkit is far from complete. In the coming year we plan to focus not only on hypothesis testing and experimentation, but also on data visualization, predictive modeling, and tools for clustering patient and provider archetypes. We’ll be sharing updates on these efforts in future blog posts, so keep an eye on our publication to follow along with our progress!

If you’re interested in joining Cedar to work on these types of projects, check out our open jobs here.
