From the Quantified Self to Epidemiological Research

It’s been almost a year since my last post here. In the past ten months we made good progress with HRV4Training, finally publishing our work on the relation between physiological data, such as heart rate and heart rate variability (HRV), and training, based on our user base — one of our early goals.

As a scientist, my first goal with HRV4Training has always been to bring a practical, easy-to-use, clinical-grade tool into the hands of thousands of people, finally allowing us to study complex relations between physiological data, lifestyle and training, beyond what can be done with regular clinical studies.

This post is about how I see the digital health space and the approach I’ve taken in the past few years while moving towards that goal: from N=1 to epidemiological research. It’s about a personal journey and is by no means a comprehensive analysis of what other companies in the space are doing. Yet, I hope these points can trigger some reflection and help channel efforts in the right direction, so that eventually both research and people can benefit.

Full paper at this link: https://www.researchgate.net/publication/301998958_HRV4Training_Large-Scale_Longitudinal_Training_Load_Analysis_in_Unconstrained_Free-Living_Settings_Using_a_Smartphone_Application

N=1 and the Quantified Self

N=1 and the Quantified Self are about self-experimentation. From my perspective as a scientist, a PhD student at the time, self-experimentation was an attempt to understand first hand more about what I was doing and the problems I was trying to solve.

Preparing for a conference. In 2012 I wore the ECG Necklace and sticky gel electrodes for almost the entire year, day and night. Don’t try it at home (I have permanent marks).

In my PhD I was working on machine learning methods for activity recognition, energy expenditure estimation and cardiorespiratory fitness estimation using wearable sensor data. Being able to wear prototypes and go through the raw data informed many of the decisions I made later on while designing studies for my own work.

I became more aware of all the factors affecting physiological responses and started developing methods & models able to deal with some of these issues (for example, heart rate-based energy expenditure estimation that accounts for differences in fitness between individuals, and is therefore more accurate during exercise).
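
To make the idea concrete, here is a minimal sketch of a fitness-adjusted, heart rate-based estimate. It leans on two textbook approximations (fraction of heart rate reserve roughly tracks fraction of VO2 reserve, and about 5 kcal are burned per litre of oxygen); the actual models were learned from reference data and are more involved, so treat the function and numbers below as illustrative only.

```python
def estimate_kcal_per_min(hr, hr_rest, hr_max, vo2max, weight_kg):
    """Rough energy expenditure (kcal/min) from heart rate.

    Normalizing heart rate to the individual's heart rate reserve and
    scaling by a fitness proxy (VO2max, in ml/kg/min) means two people
    at the same absolute heart rate get different, individually
    plausible estimates.
    """
    hrr = (hr - hr_rest) / (hr_max - hr_rest)   # fraction of HR reserve
    vo2 = 3.5 + max(hrr, 0.0) * (vo2max - 3.5)  # ~fraction of VO2 reserve, ml/kg/min
    litres_o2_per_min = vo2 * weight_kg / 1000.0
    return litres_o2_per_min * 5.0              # ~5 kcal per litre of O2

# Same heart rate (140 bpm), different fitness levels:
print(estimate_kcal_per_min(140, hr_rest=45, hr_max=190, vo2max=60, weight_kg=70))  # fitter
print(estimate_kcal_per_min(140, hr_rest=70, hr_max=190, vo2max=35, weight_kg=70))
```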

Similarly, in the early days of HRV4Training I was investigating the relation between a few months of good, injury-free training and performance, without making any adjustments, just observing how physiology changed as training load and fitness changed (full story here).

How do you go from N=1 and the Quantified Self to developing new insights, effectively carrying out large-scale epidemiological research?

My approach is based on three main pillars: 1) scientific validation of your N=1 technology, 2) being able to contextualize your data (a.k.a. you need the right reference points [it’s not an app, it’s a massive clinical study]) and finally 3) engagement. Once you have these three, you can start exploring.

Scientific validation of your N=1 technology

If your long-term goal is to move the needle in terms of using your app or technology for research, it has to be validated. Otherwise, everything built on top of this technology later on will inevitably rest on shaky foundations. Oddly enough, it’s very hard to find any validation of most digital health products released in the past 5–10 years. No scientific publications, no white papers, no blog posts. We typically have to wait for a third party to get a bunch of sensors and a reference system and show us how things actually work. Maybe this is part of what is causing so much skepticism around digital health these days. Why would I trust X if Y showed that product Z has nothing to do with actual {what you care about: calories? heart rate? glucose? blood pressure? sleep stages? etc.}.

Gaining trust from the community can be as simple as detailing your work in a blog post, showing how you compare to the gold standard, and maybe even providing some data and additional details on how the technology was developed.

We started that way, blogging about how to get HRV out of an iPhone camera with the same accuracy as a chest strap (full story here) and helping others with similar problems (see stackoverflow). We later teamed up with top scientists in the field to broaden the scope of this validation (more people, different settings, more reference systems). This way we slowly built trust in our work.

Comparison of a full ECG and PPG processed by HRV4Training, including detected peaks. All peaks (ECG and PPG) are perfectly aligned, showing how a phone camera can replace an ECG (or a chest strap) for HRV analysis in healthy individuals.
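
For anyone who wants to run a similar sanity check on their own pipeline, the comparison boils down to extracting beat-to-beat intervals from both systems and quantifying agreement. A minimal sketch, with synthetic data standing in for real ECG and camera recordings:

```python
import numpy as np

def rmssd(rr_ms):
    """rMSSD, a common time-domain HRV feature, from RR intervals in ms."""
    return np.sqrt(np.mean(np.diff(rr_ms) ** 2))

# Synthetic stand-in for a real recording: ECG R-peak times, plus a small
# detection jitter on the camera (PPG) pulse peaks.
rng = np.random.default_rng(0)
rr_true_ms = 1000 + rng.normal(0, 50, size=300)             # ~60 bpm, with variability
ecg_peaks_s = np.cumsum(rr_true_ms) / 1000.0
ppg_peaks_s = ecg_peaks_s + rng.normal(0, 0.005, size=300)  # ~5 ms jitter

rr_ecg = np.diff(ecg_peaks_s) * 1000.0
rr_ppg = np.diff(ppg_peaks_s) * 1000.0

# Agreement on the beat-to-beat series and on the derived HRV feature.
bias = np.mean(rr_ppg - rr_ecg)        # Bland-Altman style bias
loa = 1.96 * np.std(rr_ppg - rr_ecg)   # limits of agreement
print(f"RR bias: {bias:.2f} ms, limits of agreement: +/- {loa:.2f} ms")
print(f"rMSSD ECG: {rmssd(rr_ecg):.1f} ms, rMSSD PPG: {rmssd(rr_ppg):.1f} ms")
```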

Contextualize your data (a.k.a. you need the right reference points [it’s not an app, it’s a massive clinical study])

While the previous point (clinical validation) is something everyone developing technology is aware of (?), contextualizing your data and collecting the right reference points are often overlooked.

In the era of big data, the understanding is that you just need to collect a lot of data and at a certain point magic will happen. Wrong. Collecting the right reference points should be your main concern. Study design is one of the most important and often most time-consuming parts of a clinical study, and for good reason.

In HRV4Training, reference points are called Tags. Without tags, we have nothing. As important as physiological data is, when de-contextualized it means very little. Once you start tagging your measurements with information about your workouts and lifestyle, we can start making sense of them. In a clinical study, you would have participants answer questionnaires, run tests, and so on: all data that you would then use to analyze whatever you are measuring in the context of the relevant outcome.

Digital health tools need to be able to gather similar reference points and need to be designed with that in mind, so that an app can finally turn into a massive clinical study.
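
To make this concrete, a tagged measurement can be as simple as a physiological record joined with self-reported context. The schema below is a hypothetical sketch, not HRV4Training’s actual data model:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class TaggedMeasurement:
    """A morning measurement plus the reference points (tags) needed to
    interpret it. All field names are illustrative."""
    user_id: str
    day: date
    rmssd_ms: float                   # the physiological measurement
    resting_hr_bpm: float
    trained_yesterday: bool = False   # training context...
    training_intensity: str = "rest"  # e.g. "rest", "easy", "hard"
    sleep_quality: int = 3            # ...and lifestyle context (1 to 5)
    alcohol: bool = False

m = TaggedMeasurement("user42", date(2016, 5, 10), rmssd_ms=78.2,
                      resting_hr_bpm=48, trained_yesterday=True,
                      training_intensity="hard", sleep_quality=2)
# Without the tags, 78.2 ms is just a number; with them, it can be
# analyzed relative to training load, sleep, and so on.
```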

In the publication above, we analyzed the relation between physiological responses and training sessions of different intensities. More recently, we blogged about these relationships in individuals with different training loads and performance levels. Without the right reference points (e.g. subjectively annotated workouts, imported workouts from other services, etc.), none of these analyses would have been possible.

Relation between heart rate and training load in HRV4Training users, showing a clear reduction in heart rate for individuals who train more, across sports with a strong aerobic component. More stories here: http://www.hrv4training.com/blog/heart-rate-variability-performance-a-cross-sectional-analysis-of-hrv4training-runners
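
The kind of cross-sectional analysis behind a plot like this can be sketched in a few lines, assuming a per-user table with weekly training volume and average resting heart rate (the column names, thresholds and values below are invented for the example):

```python
import pandas as pd

# Hypothetical per-user summary, aggregated from daily measurements and
# annotated/imported workouts.
users = pd.DataFrame({
    "weekly_hours": [2, 4, 6, 8, 10, 12, 14, 3, 9, 15],
    "resting_hr":   [62, 60, 57, 54, 52, 50, 48, 61, 53, 47],
})

# Stratify by training load and compare resting heart rate across groups.
users["load_group"] = pd.cut(users["weekly_hours"],
                             bins=[0, 5, 10, 24],
                             labels=["low", "medium", "high"])
print(users.groupby("load_group", observed=True)["resting_hr"]
           .agg(["mean", "count"]))
```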

You can certainly get creative about how to get the right reference points (other sensors, APIs from other services, separate emails, in-app questionnaires, etc.) — the point being that it’s important to ask the right questions from the beginning. What problem are you solving? How are data from thousands or millions of people helping you? Answering these questions will point you towards the reference points that will eventually be key to doing epidemiological research.

Engagement (Marketing?)

A clinical-grade tool and a perfectly designed app will still get you nowhere in terms of advancing research if nobody is using your tool. While I am no marketing expert, the digital health space has brought some unique opportunities to create engagement and a solid community around your work.

For many technologies we are still talking about early adopters: a few thousand users that really get it and are willing to help you get the most out of the tool. They can be highly engaged for different reasons (athletes aiming at peak performance, users affected by a specific health condition, or maybe it’s just that period of their lives). The more information we can generate to help understand and interpret these data, the tighter the community becomes.

When research is the goal, the opportunity comes from simply being transparent about what you are doing and how you are doing it. As a scientist I always had little interest in protecting my ideas, and I blogged about the unique technology that enabled all of our subsequent work as soon as I had finished developing it, more than two and a half years ago (you can find it here).

Views on the HRV4Training Blog in the past 12 months.

As we now continuously blog about new findings based on user-generated data, we have users who go as far as using the app purely to provide us with data for research (thank you for that).

Being transparent and blogging about all the details of our work helped showcase what we are trying to do, created great relationships with others (competitors included), and got users more and more engaged with our work.

HRV4Training reviews. Science rocks.

Closing remarks

These are exciting times for everyone working in the digital health space. We can collect and analyze data in ways that were unthinkable just a decade ago. User-generated data is powerful, and this is a trend we are seeing more and more, especially with the release of ResearchKit and CareKit in the past two years, and the involvement of big companies like Apple.

The main idea is simple. Instead of running expensive clinical trials on a limited number of participants, we can try to provide clinical-grade tools to users/consumers, and acquire a much bigger dataset that can provide a better understanding of the relations between our outcome of interest and other variables.

However, a sound scientific approach is often lacking, making it hard to trust and benefit from most tools. In my personal experience, I developed an approach that I have tried to outline here, going from personal insights to the validation of N=1 tools that can finally be released into the hands of thousands. With the right questions, reference points & engagement, we have a unique opportunity to move the field forward.

Aiming for a massive study with many users does not mean that we forget about the individual. Eventually, we need to be able to bring this information back to the N=1 to provide better insights or care.
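
One simple way to do that, sketched below with synthetic data, is to score each new measurement against the individual’s own rolling baseline rather than against population averages (window sizes and thresholds here are arbitrary, chosen just for the illustration):

```python
import numpy as np
import pandas as pd

# Synthetic daily rMSSD series for one user; in practice this would come
# from the app's morning measurements.
rng = np.random.default_rng(1)
daily = pd.Series(75 + rng.normal(0, 6, size=90))

# Score today's value against the user's own rolling baseline,
# excluding today from the window via shift(1).
baseline = daily.shift(1).rolling(window=60, min_periods=30).mean()
spread = daily.shift(1).rolling(window=60, min_periods=30).std()
z = (daily.iloc[-1] - baseline.iloc[-1]) / spread.iloc[-1]
print(f"Today vs personal baseline: z = {z:.2f}")
print("Flag:", "outside normal range" if abs(z) > 1.5 else "within normal range")
```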

As we gather data from more people, we can also stratify based on many more parameters, meaning that instead of relying on homogeneous groups of subjects and hoping to see similar physiological responses to a specific intervention (e.g. in a clinical study), we can factor in many individual differences and characteristics, thanks to the much bigger sample size.
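
As a toy example with invented columns and values, a bigger sample lets you look at responses within strata instead of averaging over one supposedly homogeneous group:

```python
import pandas as pd

# Hypothetical dataset: one row per user, with the outcome (change in
# rMSSD after some intervention) and individual characteristics.
df = pd.DataFrame({
    "delta_rmssd": [5, -2, 8, 1, -4, 6, 3, -1],
    "sex":         ["f", "m", "f", "m", "f", "m", "f", "m"],
    "age_group":   ["<35", "<35", "35+", "35+", "<35", "<35", "35+", "35+"],
})

# With enough users per cell, responses can be examined within strata.
print(df.groupby(["sex", "age_group"])["delta_rmssd"].mean())
```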

We are trying to implement a similar approach at Bloom, where we developed (and validated) the first wearable sensor able to monitor uterine activity during pregnancy. We have many questions we are now trying to answer, from understanding what’s behind pregnancy complications to detecting the early signs of labour, but that’s another story :)

Heart rate changes during pregnancy, about 7 months of data.