Experiment Results Framework at Cedar

Arpita Bangera
The Patient Experience Studio at Cedar
Jan 23, 2024 · 5 min read

Allow me to take you on a personal journey within the realm of data and experimentation at Cedar. I joined Cedar in 2021, a significant year for me as Cedar had recently acquired the company I worked for, OODA Health. My transition from a two-person team at OODA to Cedar’s Data Science team, a dynamic group of 13 professionals, was exhilarating.

As I embarked on this new chapter of my career, Cedar’s commitment to innovation and data-driven decision making became abundantly clear. The company’s experimentation initiatives stood out as one of the driving forces behind its growth and success. This blog post will explore the world of experimentation, data aggregation and the evolution of the Experiment Results Framework at Cedar.

How experiments are conducted at Cedar: An overview

At Cedar, experimentation is more than just a buzzword; it’s a critical process that empowers the company to make informed decisions. Whether it’s optimizing user experiences, refining product features or enhancing business strategies, experimentation forms the cornerstone of Cedar’s growth.

We prioritize A/B testing on smaller groups to enhance the patient experience before implementing changes on a larger scale. At the center of our experimentation process is Strategizer, a tool that executes actions based on predefined conditions and stores the outcomes in a structured table format.

Here are the typical steps in a Strategizer experiment:

  1. Enrollment Logic: Strategizer assesses predefined conditions to decide if a user joins the experiment.
  2. Random Assignment: Users chosen for the experiment are randomly assigned to an experiment ‘arm.’
  3. Experimental Treatment: Users receive their assigned experimental treatment, and the metadata is tracked in the user table.
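The three steps above can be sketched as a minimal enrollment-and-assignment loop. This is an illustrative sketch only: the condition names, arm labels and field names are hypothetical, not Strategizer's actual API.

```python
import hashlib


def is_enrolled(user: dict) -> bool:
    """Step 1: enrollment logic -- hypothetical predefined conditions."""
    return user.get("has_outstanding_balance", False) and user.get("is_digital", False)


def assign_arm(user_id: str, arms=("control", "treatment")) -> str:
    """Step 2: random assignment via a deterministic hash, so the same
    user always lands in the same arm if re-evaluated."""
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % len(arms)
    return arms[bucket]


def enroll(users: list[dict]) -> list[dict]:
    """Step 3: record the assigned treatment as metadata on the user."""
    enrolled = []
    for user in users:
        if is_enrolled(user):
            user["experiment_arm"] = assign_arm(user["id"])
            enrolled.append(user)
    return enrolled
```

Hashing the user ID rather than calling a random generator is one common way to make assignment both uniform and reproducible across runs.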

However, managing experiments effectively requires more than just running A/B tests. It demands a robust aggregation system that can handle the complexities of data collection, segmentation and analysis.

The need for an Experiment Results Framework

In 2022, one of our company's key results focused on improving the patient collection rate by 5% across our clients. As part of this initiative, we launched around 40 experiments, each testing a different hypothesis aimed at improving the patient experience and creating value for our clients.

The increased volume of experiments posed specific challenges. I therefore set out to create a new and improved Experiment Results Framework pipeline, which would serve as the central repository for all experiments conducted at Cedar, bringing clarity and efficiency into our endeavors.

The challenge of data analysis

Before the framework, Cedar faced several data analysis challenges:

  1. Data fragmentation: Experiment data, such as user engagement metrics and individual billing details, were dispersed across multiple sources, making it challenging to access and consolidate for analysis.
  2. Individual custom queries: Analysts crafted unique SQL queries for each experiment, sometimes resulting in inconsistencies, divergent interpretations and resource-intensive processes.
  3. Increased data volume: As we conducted more experiments, the volume of data for analysis grew, lengthening the data retrieval time.

These challenges underscored the need for a centralized solution, paving the way for the Experiment Results Framework.

Enter the Experiment Results Framework

This framework relies on three fundamental building blocks: abstraction, experiment settings and aggregation.

  1. At its core, this framework embraces abstraction. It employs a base class that houses commonly used queries for reuse. With this structure in place, data scientists can create child classes for specific experiments, inheriting and customizing these queries. This approach saves time and ensures data analysis consistency.
class BaseClassQueries:
    def common_query_1(self):
        """Define a commonly used query here."""

    def common_query_2(self):
        """Define another commonly used query here."""


class ChildClassQueries(BaseClassQueries):
    """Override a base query here only when the experiment needs custom
    logic; otherwise the inherited common queries are reused as-is."""

    def child_class_query(self):
        """Define experiment-specific queries here."""
  2. The experiment settings offer a layer of customization, letting us specify the function that generates the query, the number of days the experiment should run before analysis, the experiment IDs to include and much more. This flexibility lets the framework adapt to the unique needs of each experiment.
  3. The classes described above query different base tables in our Snowflake data warehouse using the robust Python library SQLAlchemy, efficiently retrieving and transforming data into meaningful, aggregated views. Each task is managed and scheduled with Airflow, which then writes the aggregated views back into Snowflake. Data scientists no longer need to query different tables repetitively: they have a single aggregated view containing most of our key metrics, which supports rigorous data review, speeds up data retrieval and significantly enhances our workflow.
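As a rough illustration of what such per-experiment settings might look like, here is a minimal sketch using a plain dataclass. The field names, the `build_collection_query` helper and the table schema are all assumptions for illustration, not the framework's actual configuration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ExperimentSettings:
    """Hypothetical per-experiment configuration consumed by the pipeline."""

    experiment_ids: list[str]
    query_builder: Callable[[list[str]], str]  # function that generates the SQL
    days_before_analysis: int = 30             # how long to run before analyzing
    extra_filters: dict = field(default_factory=dict)


def build_collection_query(experiment_ids: list[str]) -> str:
    """Hypothetical query builder returning SQL for an aggregated view."""
    id_list = ", ".join(f"'{e}'" for e in experiment_ids)
    return (
        "SELECT experiment_arm, "
        "AVG(amount_collected / amount_billed) AS collection_rate "
        f"FROM experiment_results WHERE experiment_id IN ({id_list}) "
        "GROUP BY experiment_arm"
    )


settings = ExperimentSettings(
    experiment_ids=["exp_42"],
    query_builder=build_collection_query,
)
```

An orchestrator like Airflow can then iterate over a list of such settings objects, call each `query_builder`, and materialize the results as aggregated views.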

There are several other benefits to implementing this architecture:

Code Reusability

One of the standout features of the Experiment Results Framework is its emphasis on code reusability. By providing a structured template for base queries, it empowers Cedar's analysts to write cleaner, more efficient code without rewriting the same queries for every experiment. This not only saves time but also ensures consistency across experiments.

Scalability

As we launch more experiments, the framework's design makes it easy to read in new experiment data or create a pipeline for experiments that differ in key ways from the standard setup. One example is our discount experiment, which uses a machine learning approach requiring a very specific type of logic for including invoices, based on account balance. The scalability of the framework enables us to reuse existing base queries while writing customized queries to include only invoices assigned at the onset of the experiment, facilitating analysis of user behavior during the initial phase.
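The discount experiment's customization can be sketched as a child class that overrides only the invoice-inclusion logic while inheriting everything else. The class and method names, the `min_balance` parameter and the column names are hypothetical stand-ins for the real framework:

```python
class BaseExperimentQueries:
    """Shared queries inherited by every experiment (illustrative names)."""

    def enrolled_users_query(self) -> str:
        return "SELECT user_id, experiment_arm FROM experiment_users"

    def invoice_filter(self) -> str:
        """Default: include every invoice tied to an enrolled user."""
        return "1 = 1"


class DiscountExperimentQueries(BaseExperimentQueries):
    """Overrides only invoice inclusion: restrict to invoices assigned
    at experiment onset, above a minimum account balance."""

    def __init__(self, min_balance: float):
        self.min_balance = min_balance

    def invoice_filter(self) -> str:
        return (
            f"account_balance >= {self.min_balance} "
            "AND invoice_assigned_at = experiment_enrolled_at"
        )
```

The base queries stay untouched, so the customized experiment still benefits from any fixes or improvements made to the shared layer.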

Readability and Maintainability

Code clarity is an important consideration when dealing with complex experimentation data. The framework's structure enhances readability, making it easier for human readers to understand, interpret and modify aggregation logic as needed. This, in turn, improves maintainability and reduces the risk of errors.

Bottom line

Implementing this consolidated aggregation pipeline has produced tangible benefits. After analyzing the aggregated data for the experiments supported by this framework, we saw that some experimental features had positive impacts on collections, some were less favorable and others exhibited no change. This information guides us in deciding what features to roll out to our full population.

[Illustrative diagram showing how Cedar uses experimentation to optimize the patient billing and payment experience]

The framework has helped us enhance our understanding of collection rates while reducing the time spent on tasks like adding new fields. The pipeline plays a crucial role in the creation of informative dashboards that provide continuous insights into our experiments, enabling us to track metrics like the percentage of users engaged digitally at the 120-day mark and many more.
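As a toy illustration of one such dashboard metric, the "percentage of users engaged digitally at the 120-day mark" could be computed like this. The field names and record shape are assumptions, not the pipeline's actual schema:

```python
from datetime import date


def pct_engaged_digitally(users: list[dict], day_mark: int = 120) -> float:
    """Share of enrolled users whose first digital engagement occurred
    within `day_mark` days of enrollment (illustrative field names)."""
    eligible = [u for u in users if u.get("enrolled_on") is not None]
    if not eligible:
        return 0.0
    engaged = sum(
        1
        for u in eligible
        if u.get("first_digital_engagement") is not None
        and (u["first_digital_engagement"] - u["enrolled_on"]).days <= day_mark
    )
    return 100.0 * engaged / len(eligible)
```

In practice a metric like this would be computed once in the aggregated view rather than recomputed by every analyst, which is precisely the consistency the framework provides.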

Furthermore, the framework captures the outcomes of diverse users enrolled in various experimental treatments, providing a comprehensive understanding of how different strategies affect user behavior. The data generated by this pipeline serves as valuable training data for our machine learning models as well. To explore how this data has been harnessed for the optimization of user interactions, check out our recent article on Machine Learning-Powered Discounts.

This journey has not only expanded my own horizons but also underscored the importance of having a data-centric framework to understand experimental results and make meaningful decisions. I am excited by the continued development of our experiments and our unwavering commitment to enhancing healthcare experiences.
