Designing a product to coach runners: a whitepaper

Published in Byrd Run Club, Feb 8, 2021

An image of a runner holding Byrd running app showing their upcoming runs. A mountain is in the background.

Preface

Theory is boring; running down hills is much more fun. But we think it’s important to lay out the underlying philosophy of Byrd for people who are looking to collaborate with us. This document captures the initial theory, assumptions and biases that underpin Byrd and how it then works as a product in practice.

This whitepaper began life as an internal document within Byrd. We used it collectively to make design and development decisions. It has morphed and changed over the past year. At one point we rewrote it entirely to better combine how we understood the interaction between physiology and psychology.

It is not normal to share internal documents externally, but then Byrd isn’t a normal company. We’re interested in making the running experience better for runners, nothing less, nothing more. To do that we know we need to ideate, iterate and fail. Keeping the whitepaper as an internal document means that it’s never going to be tested, and so it’s never going to fail. Internally we’re unlikely to spot any logic faults, non-sequiturs and blind spots that it may have. We will miss out on knowledge we’ve not yet had the chance to acquire.

This is a living document that’ll be updated as we learn new things and work with more people to create our coaching app for runners. Be brutal, we can only learn from criticism!

Version: 0.5.1
Updated: 08 February 2021

Running as a design process

Byrd was started as a design problem to solve. We’re using the word design here in the sense of how something works, rather than how it appears. As Raymond Loewy said, “Good design is not an applied veneer.”

Design, as empirically measured, is similar to running. In design, like running, a grand plan will fail, a one-size-fits-all plan will fail, a plan without some sort of goal will fail. Design works by making iterative, slow, imperfect progress. It is non-linear: even after years a talented designer may produce poor work. In the domain of design we talk about this being the action-centric model; in running it’s our lived experience.

Running is a skill that needs to be mastered and learned. Like music or language it requires repetition and near daily practice to see progress. If we’re going to make it a daily activity it needs to be something that brings joy. Too often running is advertised as being something that’s hard. This is incorrect. It can’t just be a chore that we have to get done else it’ll never become a habit. Any system to support runners then needs to be designed in a way that allows the runner to escape the daily grind of life, instill knowledge and confidence, and allow the runner to grow as an individual.

Runner-centred progress

Reductively, within pedagogy the learner-centred education theory is based on the idea that the most successful outcomes are those that come from giving autonomy and independence to learners. That is, a student-defined outcome is much more likely to be achieved than one imposed by an external authority. Beyond pedagogy this is echoed by research in design thinking, psychology and behavioural economics where giving more autonomy to the user in any given setting tends to lead to greater success for that individual.

This resonates with how runners think of progress. Running’s an intrinsic sport where the only person who can validate success is the individual. For Byrd to succeed it has to lean on that self-motivation where progress itself is an incentive. In this way the product is an extraneous support to facilitate habit forming by giving positive reinforcements. Byrd pushes progress by introducing new experiences and giving immediate feedback.

Facilitator and advocate

Byrd aims to create stronger runners by nudging them to get out running. Once they’ve been running we can then remind them how awesome the experience was. This relates to a number of ideas from different domains. In psychology it’s often referred to as the “Habit Loop”, a phrase popularised by Charles Duhigg. In Behaviour Design it’s Fogg’s Tiny Habits model. In design thinking it’s Don Norman’s action model. It can be summarised as “Trigger, behaviour, reward”.

In Byrd the trigger, behaviour, reward loop is started by generating runs that are uniquely tailored to the runner’s individual circumstances. The trigger is the planned run, and an explanation about the ‘why’ of that planned run for the runner. The behaviour is obviously them running. The reward is reflecting the run back to the user. This reward takes a number of different forms depending on the context of the user or the context of their run. The reward needs to reflect the magnitude of the behaviour. If they’ve just smashed their PB then every firework we can set off should be set off. If it’s just an easy run where they’ve stuck to the correct pace it’s perhaps just a small animation with an explanation of how the run went and what that means to how they’re progressing. If the behaviour isn’t the expected behaviour, i.e. the user deviates from the planned run, the presumption is that it was intentional with Byrd feeding back in a neutral way to the user what impact the run has had on their training.

The post-run messages then are reflected back in a non-hierarchical way. That is, as Alfred Adler might say, the messages neither praise nor admonish. Byrd will never say a run was ‘unproductive’ and will equally never tell a user that they’re special. We do this to ensure that we’re runner centred with the individual user always being the person in control of any human-machine interaction.

Finding flow

For a user to progress in any domain, learning needs to be paced in a way that matches their lived experience: it can’t be too hard or the user will burn out, and it can’t be too easy or the user will rust out. Within psychology, being reductive, the space in the middle is a state of ‘flow’ that was first described by Mihaly Csikszentmihalyi. Flow state is a point where a user is fully immersed in an activity with a challenge and skill level that matches the user’s abilities. The way that Byrd progresses the user, at whatever level they’re at, is to support getting towards flow state within their running.

Byrd deliberately spaces runs to try to synchronise with the user’s stress / supercompensation rhythm. This is an art not a science because every individual is different. This is why Byrd additionally offers the user the ability to input their perceived effort, and give further information about their experience of the run. Those data points are useful for Byrd but their main utility is for the user themselves so that they can progress in how they understand their body’s response to exercise. This understanding either reinforces, or introduces, longer term codification of running knowledge.

Avoiding recipes

Humans are very good at spotting patterns. We’re so good at spotting patterns that we often see them where there aren’t any (e.g. clustering illusion and other pattern biases). Humans are also very good at believing they’re uniquely different to everyone else (hence the many egocentric biases). This creates a conundrum for any product trying to get someone to progress towards a goal, whether that’s mastering a new language, or trying to get a new job. To progress, a user, by definition, has to engage in repetitive behaviour. This is compounded for Byrd by the fact that running is itself inherently repetitive as a sport. If it is too repetitive though it’ll become demotivating and appear to be a ‘one-size-fits-all’ plan.

Byrd adapts to the user’s training primarily based on the user’s interaction with the plan. Their plan will become more difficult if the user picks a more challenging goal. Their plan will reduce the runner’s distances if they take a break. The plan will increase the runner’s paces if we believe they’re fitter, and so on. After every run, or every day where the runner hasn’t run, Byrd will evaluate where the runner is and will calculate what changes need to be made for the runner.

If a runner were to make no progress they would still see changes in how their plan responded to their running. As with the seasons and weather in the real world Byrd will visually change through the year, reflecting their running context and experiences. We do this through image, illustration and messaging. Additionally within the plan generation there are slightly randomised numbers that have no material impact on training but that give the impression of the plan being more curated. From our testing seeing three runs that are 37 mins, 41 mins and 42 mins has more resonance with a runner than three 40 minute runs even if they’re materially the same.
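The duration jitter described above could be sketched roughly as follows. This is only an illustration: the `jitter_durations` helper, the three-minute spread and the use of a seeded generator are all assumptions, not a description of how Byrd generates plans.

```python
import random


def jitter_durations(base_minutes, count, spread=3, seed=None):
    """Return `count` run durations close to `base_minutes`.

    Hypothetical helper: each duration is offset by a whole number of
    minutes in [-spread, +spread], small enough to leave the training
    load materially unchanged.
    """
    rng = random.Random(seed)  # seeded so a plan can be regenerated identically
    return [base_minutes + rng.randint(-spread, spread) for _ in range(count)]


# Three nominally 40 minute runs, each nudged by up to three minutes.
runs = jitter_durations(40, 3, seed=1)
```

Seeding the generator is a design choice for the sketch: it means the same runner would see the same "curated" numbers if the plan were rebuilt.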

Plurality of goals to improve outcomes

When Byrd was first conceived we only offered a single goal type, which related to racing. Think of the traditional plan where it’s offering the runner a chance to run a sub-90 half or a sub-4 marathon and that was what we were building. Talking with runners, in focus groups and one-on-one, whilst reflecting on our own running habits made us realise that this didn’t adequately account for how runners set goals.

People take a pluralistic approach when they think about goals.

Building the product out we have started to reflect this plurality. Performance goals are now broader than just allowing a user to target a certain distance in a certain time. For many the performance goal may simply be completing a new distance. We have introduced cumulative goals that allow the user to track weekly, monthly and annual distance, elevation and duration. And we have introduced experience goals.

These top-level goals are broken down into smaller components. Each performance goal contains a series of ‘epics’. If a run is relevant to an experience goal that information is reflected back to the user. If a run contributes to the runner hitting their cumulative goal it’s reflected to the user. Being reductive, we have taken elements of game design (points, bonuses, levels, boss levels etc.) to reinforce positive behaviour. This breakdown echoes the assertion of BJ Fogg, the Stanford-based behaviour designer, that habits are maintained when they’re regularly celebrated. But it also echoes how runners approach running when it becomes tough. In a race we think about holding the pace for the next kilometre, or making it to the next checkpoint, since we’d be overwhelmed if we thought of the whole.

The aim of this is to reinforce the ‘little and often’ thinking around attaining goals. That is, progress is likely to be non-linear, but the best way to progress is to keep getting out the door and going for a run. By having a plurality of goals we give more tools to the user to motivate themselves to get out running. If one goal feels unattainable, their performance goal time for example, a secondary, or tertiary goal, may be the thing that gets them out the door. Progress isn’t linear and the ability for users to keep moving forward is critical to facilitating their progress.

High specificity goals to facilitate outcomes

Beyond plurality though Byrd also offers some goals that are deliberately high in levels of specificity e.g. “I want to run the Bristol Half in 74 minutes 30 seconds”. Generalised goals have lower success rates. Running 1,000 miles a year has a lower intrinsic resonance than removing a number of minutes from a goal time, or being able to visit a location they previously thought was unattainable. Running 1,000 miles is a means not an end. It returns to Byrd taking a self-determined approach to learning where, if a runner is intrinsically motivated by a goal, they’ll want to conquer the goal to receive internal rewards, and will be satisfied because of it.

Beyond specificity the goals on Byrd are both challenging and attainable. Finish times, or distances, offered are based on the user’s current running state and a confidence scale attached to outcomes. Edwin Locke et al, within psychological studies in the 90s, showed that attainability and commitment are key components of goal setting. This makes intuitive sense, and echoes Csikszentmihalyi’s concept of flow. A goal needs to be challenging but not impossible to allow someone to achieve it.

Personalisation to improve outcomes

Personalisation takes many forms. Repeating the user’s name, echoing their location and reminding them of their experiences are all powerful forms of personalisation. Byrd, initially, reflects the weather, time of day and location that the user has run within. We allow the user to collect experience goals that are relevant to their geographical location and the times they go running. The product does this because, as economists like Kahneman and Tversky have shown, or psychologists like Anand and Ross, having ourselves reflected in this way leads to the individual trusting the product more and being more invested in the outcomes. Those conditions are likely to lead to the user having more success in achieving their goals with the product.

Personalisation extends to the modality of our progress as runners. That is, how intensely do we want to train and where, attitudinally, are we within a goal cycle. Byrd frames this through five different training modes — Performance, Maintenance, Build-up, Holiday and Off — which will impact intensity of training and how the product talks and interacts with the user. If the user is attitudinally on holiday there’s no point giving them anything mentally or physically stressful because they’re not going to successfully complete it.

Beyond modality it also extends to empowering the user to make day-to-day decisions about how they want to train. We learnt this the hard way. Initially Byrd, once a user was running a certain amount and targeting a certain goal, would add double days to a plan by default. It is physiologically the most efficient way to improve, since it balances stress and recovery more frequently, but it doesn’t necessarily fit people’s daily commitments or their mental model of running. Speed sessions can provoke an equally Marmite (love it or hate it) response. We recognise giving these options to users may inhibit their progress but trust a user to be more likely to make the correct choice than an AI system that’s blind to many of the user’s lived experiences.

A human could do this better

There’s extensive research showing that the best extrinsic force to maintain motivation is another person. This can be negative motivation (a micromanaging parent, for example) or positive (your running buddy or running club group). It’s why running clubs are incredibly good places to go if you’re wanting to push on as a runner. Byrd is designed as a complement to these relationships, as a facilitator, not a driver.

People may not have access to friends who share their interest, may not have access to a running club, may not have confidence or the spare cash to hire a coach. Byrd gives safe, non-judgmental space for making progress.

Because we recognise the importance of human interaction towards progress, we have plans to allow micro social groups within the product in the future, and are working with coaches to improve and iterate how Byrd interacts with users.

Storytelling to aid progress

Byrd uses narrative extensively and uses Natural Language Generation to create stories that are contextually relevant for the user and their running. As referenced earlier this reflection is part of the ‘habit loop’. We use stories as they’re much easier to remember than statistics. They resonate on a visceral, rather than cognitive, level and allow a relationship between the product and user. It ties to a trend within design towards giving qualitative feedback to the user rather than burying them under a deluge of data.

Storytelling is also used as a means of disclosing information in a way that allows the user to focus on a single piece of information at a time. If — taking a random example — Byrd wants to talk to the user about cadence, goal day setting and the pace consistency of the most recent run it would be impossible to do on a single screen. These need to be clearly separated for the user to allow them to get the most out of the information and enable them to take the appropriate action.

Beyond aerobic fitness towards holistic fitness

The majority of research over the years looking at endurance sports has focused on aerobic fitness. It’s the classic running-to-exhaustion treadmill test that ignores the fact that humans aren’t machines. Over the last decades research showing the importance of other systems has started to be published, such as Tim Noakes’ Central Governor model. Researchers and authors like Christie Aschwanden or David Epstein and coaches like Shane Benzie have also started popularising the interplay between different systems.

At Byrd we’ve taken the opinion that there are five systems, or processes, that are relevant to running. Specifically: the mind, aerobic fitness, skeletal system, fascia system and muscle relevancy. These systems are used differently by a runner depending on their goal(s) and their attitudinal state. To be explicit: someone trying to tackle a fast 5K is going to find it difficult to hit top speeds if they’re also training for a hilly ultra.

Byrd can only track a user with their running data. This makes some of these elements very difficult to track directly. We use a number of adversarial algorithms alongside pattern matching algorithms to track them but have ambitions to go further in the future. We augment this data, and resolve fuzzy data that’s difficult to match, by talking directly to the user with dialogue flows.

It would be easier if we just used the Training Stress Balance model or the Banister model to coerce aerobic stress to a fitness score. However, we’ve seen through testing that running, and modelling human behaviour, has more complexity than that, and so we need to ensure we’re tracking more data points.
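For readers unfamiliar with it, the Training Stress Balance idea can be sketched in a few lines. This follows the commonly published form (a slow-moving "fitness" average and a fast-moving "fatigue" average of daily stress, with the balance being the difference); the 42 and 7 day time constants are the conventional defaults, and the function name and sample data are assumptions for illustration rather than anything Byrd uses.

```python
def training_stress_balance(daily_stress, chronic_days=42, acute_days=7):
    """Return (fitness, fatigue, balance) after a sequence of daily stress scores.

    fitness: chronic load, an exponentially weighted average over ~42 days
    fatigue: acute load, an exponentially weighted average over ~7 days
    balance: fitness minus fatigue; negative means tired, positive means fresh
    """
    fitness = 0.0
    fatigue = 0.0
    for stress in daily_stress:
        # Each average moves a fixed fraction of the way towards today's stress.
        fitness += (stress - fitness) / chronic_days
        fatigue += (stress - fatigue) / acute_days
    return fitness, fatigue, fitness - fatigue


# Four weeks of steady training, then the same block followed by a rest week.
loaded = training_stress_balance([50] * 28)
rested = training_stress_balance([50] * 28 + [0] * 7)
```

In this sketch the balance is negative straight after the training block and turns positive after the rest week, because fatigue decays much faster than fitness. It is exactly this single-number view that the paragraph above argues is too simple on its own.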

Excluding heart rate data

Heart rate data is so commonly used to measure stress that it’s almost received wisdom that it’s essential to track running progress. Our exclusion of heart rate data, initially, came from the inability to rely on the accuracy of heart rate monitors. Optical wrist monitors are notoriously unreliable and their readings will change depending on the weather, altitude or wrist shape. As designers, working with data, we know that once incorrect data has polluted the dataset it’s impossible to trust. Returning to the philosophy of runner-centric technology, a heart rate monitor also removes a lot of agency and knowledge from a runner. Running on feel is an essential part of running if someone is going to get to a state of flow within their practice.

Byrd depends on reported perceived effort. It is generated programmatically but is editable by the user, on the basis that users are more likely to edit a pre-filled value than to fill in an empty one. The user moving their perceived effort allows Byrd to start conversations with a user. If someone indicates that what should have been an easy run was hard then Byrd can have a certain confidence that we need to set slower paces for that user.
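As a rough illustration of that feedback loop, the sketch below slows an easy pace when the reported effort is higher than expected. Everything here is hypothetical: the `adjust_easy_pace` name, the 1 to 10 effort scale and the 2% step per effort point are assumptions chosen for the example, not Byrd's actual rules.

```python
def adjust_easy_pace(pace_sec_per_km, expected_effort, reported_effort, step=0.02):
    """Return a slower easy pace when a run felt harder than planned.

    Efforts are on an assumed 1-10 scale. Each point of effort above the
    expectation adds `step` (2% by default) to the seconds-per-km pace.
    A run that felt as expected, or easier, leaves the pace unchanged.
    """
    overshoot = max(0, reported_effort - expected_effort)
    return pace_sec_per_km * (1 + step * overshoot)


# An easy run planned at 5:30/km (330 s) that was expected to feel like a 3
# but was reported as a 6.
new_pace = adjust_easy_pace(330, expected_effort=3, reported_effort=6)
```

The sketch only ever slows the pace; deciding when to speed a runner up would presumably draw on more evidence than a single effort report.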

Stress and supercompensation as proxy for progress

Within human physiology there is an echo to learning psychology around the amount of physical progress an individual can make. As with learning our bodies need to make small incremental progress based around the body’s adaptation to stimulus. To be hugely reductive when a person undertakes any sort of physical activity their body will be put under stress. If that stress is beyond a certain threshold the body will go through a process of adaptation. It’s the concept of ‘little + often’, or as the coach Steve Magness puts it ‘Stress + Rest = Growth’. For convenience:

Stress: the act of going out running
Recovery: where the body returns to the starting state before exercise
Supercompensation: where the body will build muscle to compensate for the stress
Regression to mean: where the body, favouring stasis over all other things, will regress back to an earlier state of fitness if no further stress is introduced, losing muscle etc.

There is a reasonable body of research around supercompensation. The holistic interplay between different systems though is complex and we’re not proposing that Byrd is tracking these. We consider supercompensation in the way that Daniel Kahneman talked about the mind being divided between System 1 and System 2 thinking. It’s a convenient shorthand to talk about a concept. We express this graphically back to the user so that they can track their progress.

Nobody cares about sports science. They care about how they feel.

None of this is important to the end user, unfortunately. We’ve learnt this the hard way through testing. As a result we don’t present it to them in the product and we work hard to avoid them seeing any hard edges caused by bringing together physiology, psychology and design thinking.

No one is running for some unknown future health benefit. We’re running because it gives us an escape from the chaos of the world, gives us the satisfaction of mastery and lets us have the satisfaction of a daily adventure. Byrd is designed to facilitate that adventure.

Nice philosophy, where’s the maths?

Byrd uses a lot of mathematical models to work. Philosophy is what underpins them: artificial intelligence isn’t neutral and software is steeped in the bias of whoever wrote it. This document was written as much for us at Byrd as for anyone else, to ensure that we have a roadmap around how we want to develop technology to support runners, not just create technology for the sake of it.

But to the maths! Byrd has a few different layers of algorithms that track a user’s progress. We track the user’s volume (distance, duration and elevation), which allows us to track acute and chronic load and how much future volume they could have added or, perhaps, removed. We track their adherence to the training plan and in particular how well they’re able to maintain their pace during the more challenging sessions. From the perspective of how well the plan is working for a user, being able to see that they can keep up with a two-hour progression run, for example, is fairly instructive. We also look to model the user’s aerobic capacity mathematically. We track this with three different algorithm groups:

A single event, as shown by Jack Daniels and Pete Riegel (in separate studies), whether that’s a race, epic or training session
A single period of time, such as a month or 12 weeks, as shown by Giovanni Tanda
A continuous period of time, as shown by Eric Banister and Andrew Coggan (in separate studies), or the very common TSB (Training Stress Balance) model
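To make the first group concrete, Pete Riegel's widely quoted formula predicts a time over one distance from a known result over another: T2 = T1 × (D2 / D1)^1.06. The sketch below implements just that published formula; the function name and example figures are illustrative assumptions, and it says nothing about how Byrd weights or adapts the result.

```python
def riegel_predict(known_distance_km, known_time_min, target_distance_km, exponent=1.06):
    """Predict a finish time (minutes) for a new distance using Riegel's formula.

    T2 = T1 * (D2 / D1) ** exponent, where 1.06 is the commonly quoted
    fatigue exponent: pace slows slightly as the distance grows.
    """
    return known_time_min * (target_distance_km / known_distance_km) ** exponent


# A 50 minute 10K projected to the half marathon distance (21.0975 km).
half_marathon_min = riegel_predict(10, 50, 21.0975)
```

For this example the prediction comes out at a little over 110 minutes. A single-event estimate like this is volatile, since one good or bad race shifts it entirely, which is why it is only one of the three groups listed above.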

We use adversarial algorithms to arbitrate between these individual tokens and assign a confidence value between them. We’ve developed these algorithms based on observational running data. We expect, as Byrd develops, that we can refine and improve on them. Different goals have slightly different weightings as to which algorithm group to give most confidence. Overall we’re seeking to find a balance between the volatility of an individual run against the inertia of a token generated across a large period of time.
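One simple way to picture combining several estimates with confidence values is a weighted average. To be clear, this is a deliberately simpler stand-in for the arbitration described above, not the adversarial approach itself; the `blend_estimates` helper and the example numbers are hypothetical.

```python
def blend_estimates(estimates):
    """Combine (value, confidence) pairs into one confidence-weighted value.

    Confidences are non-negative weights; they do not need to sum to one.
    """
    total_confidence = sum(confidence for _, confidence in estimates)
    if total_confidence == 0:
        raise ValueError("at least one estimate needs a non-zero confidence")
    weighted_sum = sum(value * confidence for value, confidence in estimates)
    return weighted_sum / total_confidence


# Hypothetical fitness scores from the three algorithm groups, with the
# long-running continuous estimate trusted most.
score = blend_estimates([
    (48.0, 0.2),  # single event: volatile
    (45.0, 0.3),  # single period
    (46.0, 0.5),  # continuous period: high inertia
])
```

Shifting weight towards the single-event estimate makes the blended score more responsive to one run; shifting it towards the continuous estimate makes it steadier, which is the volatility versus inertia trade-off described above.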

Many of Byrd’s calculations depend on computing the stress, or training impulse, of the user’s run. Our aim is to ensure that stress is based on intensity rather than simply being based on distance or duration. Runs that have different distances may have the same stress when elevation or weather conditions are included. Consider, for example, a 13km run in 30°C, 80% humidity with 500m elevation gain against a 27km run in 15°C, 60% humidity with 100m elevation. They have very different distances, but they’d probably create a very similar stress response for a runner.

In addition to relying purely on computation Byrd regularly initiates feedback loops with the user. The most common of these is the user reporting perceived effort, but can be augmented depending on the deviation from the expected behaviour (e.g. the user ran substantially slower, or faster, than expected) or the importance of the event (e.g. finding out whether a user found an epic easy or hard). If a user amends their perceived effort Byrd will, in turn, change their stress score from that run.

Game theory applied to training

Economic theory would argue that running can be modelled mathematically in the same way that any other human domain can. Every decision is a cost-benefit analysis and every run, or non-run, is an opportunity cost. Whatever decision is made — or not made — can be modelled using decision trees.

Reductively, game theory is a maths discipline that’s been around since the 1700s but really rose to prominence after the Second World War. It initially looked at zero-sum games, that is, games where one player’s gain is another player’s loss. It is very far out of the scope of this whitepaper to go deep into the intricacies of game theory; however, it gives some very useful concepts for Byrd to use.

A simple way to look at this is with chess. In chess each piece has a nominal value in any given location on the board. That nominal value can be changed depending on the context of other pieces. So in practical terms a rook that hasn’t yet been moved has less value than a rook in open play. A rook that can move to check the opponent’s king has a higher value than one with no possible offensive move. With that knowledge it’s possible to create a simple grid of what the ‘cost’ of each possible move might be. That information allows us to create a minimax algorithm that will recursively evaluate all the decisions to a certain depth. The minimax algorithm can be made more performant by a method known as alpha-beta pruning. Alpha-beta pruning will discard any branch of the decision tree where the next move leads to a worse situation than a previously discovered one. Reductively, this is what allowed Deep Blue to beat Kasparov in 1997.

Running — outside a race — is clearly not a zero sum game but these same techniques can be applied to running when creating a plan for the user, or choosing their next run. Recursively calculating what the best possible outcome would be for the user — as in, how do they hit their goal — allows Byrd to pick the most performant next run given all of their options.

Matching patterns to understand what a user has done

It would be very boring if every time a user went running they had to record what they believed that run was. In some scenarios they might not know exactly what the run was. However, for Byrd to function correctly it has to understand whether the user has done the expected run they’ve been given.

Fortunately running is quite predictable. If a user goes out and runs 12km at their base pace for all of the laps we can say with a very high level of confidence that the person has just run a base run. Likewise if we see several slow kilometres followed by 18 cycles of 4 minutes fast, 2 minutes slow before seeing another few slow kilometres we can be very confident that they’ve just been on an interval session.
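A very reduced sketch of this kind of matching is shown below, working only from lap paces. The `classify_run` name, the 5% tolerance band around base pace and the minimum of three repetitions are assumptions for illustration; real matching would also need lap durations, distances and the planned session.

```python
def classify_run(lap_paces, base_pace, tolerance=0.05, min_reps=3):
    """Label a run from its lap paces (seconds per km; lower is faster).

    "base":      every lap is within `tolerance` of the base pace
    "intervals": at least `min_reps` separate fast efforts are found
    "unknown":   anything else
    """
    fast_cutoff = base_pace * (1 - tolerance)
    slow_cutoff = base_pace * (1 + tolerance)
    if all(fast_cutoff <= pace <= slow_cutoff for pace in lap_paces):
        return "base"
    # Count each switch from a non-fast lap into a fast lap as one repetition.
    reps = 0
    previous_fast = False
    for pace in lap_paces:
        is_fast = pace < fast_cutoff
        if is_fast and not previous_fast:
            reps += 1
        previous_fast = is_fast
    return "intervals" if reps >= min_reps else "unknown"


steady = classify_run([300] * 12, base_pace=300)
session = classify_run([330, 330, 240, 340, 240, 340, 240, 340, 330], base_pace=300)
```

The first call is labelled as a base run and the second as an interval session: slow warm-up laps, three fast efforts separated by recoveries, then a slow cool-down.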

We pattern match to help a user visualise their previous training, alongside using it when considering the stress, points and generated message for the user. We also do this to avoid training monotony when generating future runs. We talked earlier about how good humans are at spotting patterns. We deliberately want to ensure that a user is given a different session, even if it has exactly the same stress impulse as another one, to make a user feel individual and like their plan is changing.

References

The whitepaper currently does a poor job of naming sources. Given it’s continually in flux, it seemed to be a case of premature optimisation to put them in at this stage. We will update and improve this in future iterations. In the meantime either Edd (edd@byrd.run) or Nina (nina@byrd.run) would be super happy to fill in any holes that have been left by the lack of references.