Creating appropriately difficult challenges

Secrets for Generating the Goldilocks Condition (not too hard–not too easy)

How to motivate people to keep playing health games long enough to produce benefits

16 min read · Mar 16, 2020

Secrets of Creating Effective Health Games — part 3

Finding the Perfect Balance

It takes time for digital therapeutic games to produce measurable results … just as going to the gym once doesn’t get you in shape. Players need to complete multiple sessions of the “gaming therapy” to accumulate the small improvements of each session. One of the best ways to motivate someone to keep coming back is to pose an appropriately difficult challenge … a game level whose difficulty matches the player’s current ability.

Woman tentatively snowboarding — Fear of failure can prevent flow

When a challenge is too hard, fear of failure can interrupt the focused flow state, making it even harder to face the challenge. Millisecond by millisecond, the player’s brain fights a losing battle as fear siphons away more and more of their attention. Tension mounts and players will often quit the game to escape their frustration.

Woman looking bored while playing chess — “Too easy” games can be boring

Conversely, when a challenge is too easy, the player’s mind intermittently drifts away from the task, dropping out of the desired ‘flow state.’ Their divided attention eliminates the benefits that focused effort can produce. After several boring experiences, most players will stop playing and find something more engaging to do.

Creating an appropriately difficult challenge for one player is hard enough but given that a collection of players will have different levels of ability, the game must be able to present a range of appropriately difficult challenges to meet these diverse needs. And as any player gets better, the game needs to give that player a harder challenge to keep them motivated (players “skill-out” of any one game challenge).

Winning a computer game should be like winning a tennis match 7–6, 6–7, 7–6. And as you get better, your opponent gets better too.

Trip Hawkins, Founder of Electronic Arts

Graphic showing player trajectories on a graph of difficulty and ability

A health game designer’s challenge can be illustrated by a diagram that plots game difficulty against player ability. The goal of a health game is to guide a player, over the course of several weeks, along a “flow-focused” climb to higher levels of game difficulty. Each person will have their own starting point and trajectory within the game, further complicating the task of creating appropriately difficult challenges for a broad audience.

Graphic comparing two methods of creating appropriately difficult game challenges

The two most prominent methods for handling this logistical challenge are Difficulty Progression and Closed Loop. Both have their strengths, which provocatively match with the other’s weaknesses. We’ll cover their merits separately and then explore a promising hybrid approach.

Difficulty Progression

This design approach, a common tactic when designing digital entertainment games, involves creating a sequence of progressively more difficult levels. Strong players will coast through the initial levels, rising until they reach one that matches their ability. Weaker players may feel challenged by the second level (the first level is the “everyone wins” or tutorial level that boosts player confidence).

Menu screen for Balance Games therapy program — Menu screen for Balance Games therapy

The benefits of this design approach became evident in a series of mini-games my former company, Red Hill Studios, created with Glenna Dowling and Marcia Melnick from the School of Nursing at UCSF to help people with Parkinson’s disease improve their gait and balance.

Screen shot from Rail Runner health game — Screen shot from Rail Runner mini-game

One mini-game challenged players to stand up and sit down to power a virtual railcar through a 3D scene. The game was modeled on a sit/stand exercise that was part of a successful non-gaming therapy developed by Dowling and Melnick.

When the player stood up, the railcar handle moved up; when they sat down, the handle moved down and the railcar moved forward along the track. The more the player stood up and sat down, the farther they moved along the rails through a 3D scene.

We tested the prototype with people who had Parkinson’s to determine how many repetitions of standing up and sitting down they could do initially and then estimated what they might be able to take on after playing the game for several weeks.

Diagram of difficulty staircase for Rail Runner game

Based on these tests, I created an initial version of the “difficulty staircase” for this mini-game. The first level was designed so that nearly every patient would be able to complete it (“easily assured initial success”). Another consideration was that, for safety reasons specified by our clinical partners at UCSF, Parkinson’s patients would always start at level 1, regardless of how high a level they had reached in their last session.

I set an ambitious goal for the top level since that would be the hardest difficulty for the 12-week study and we wouldn’t be able to add levels mid-study if the games were too easy. I was confident from earlier projects that at least some of the participants would climb their way to the top level, if not succeed at it. But when I sent my design off to UCSF for their feedback, I got a quick and sharp reply from Judy Mastick, who was recruiting the patients for the clinical study, and a healthy debate followed.

“Bob! These levels are much too hard. There’s no way someone could handle level 5. These people have Parkinson’s disease!” Judy responded.

“I know it’s challenging but they get to choose which level to play,” I replied. “If no one gets to the top level, that’s fine. But I think that over 12 weeks, a lot of them will climb the staircase.”

“You’re dreaming,” was Judy’s reply.

“Well, I’ll bet you a lunch that at least half of the subjects will get to level 5 on at least one of the mini-games!”

“Ok, I’ll take that bet,” Judy responded confidently.

We tracked player performance in near real-time as they performed the games in their own homes (we uploaded game data to the cloud at the end of each session). An interesting pattern emerged as players chose which levels to play (remember, they always had to start at level 1).

Player progression on difficulty staircase, gradual climb to harder levels — Typical “climb” up a difficulty staircase

Most of the patients failed the first time they tried a harder level. They would return to the comfort and ease of the game level they had just come from. But they would soon get bored with the easier level and return to the harder level … and they would eventually win! The pattern then repeated as they got stronger and steadily climbed the difficulty staircase higher and higher: step up one level, lose, go back to the previous level, try the harder level again, win! Within the first few weeks of the clinical study it became clear that I’d win the bet!

Graph showing 180% average increase in exercise for Balance Games patients — Players increased exercise by an average of 180% over 12 weeks

Overall, the difficulty staircase encouraged most of the patients to challenge themselves with harder and harder levels. In fact, they increased their amount of exercise by an average of 180% over the 12-week gaming regimen. Their balance and gait also improved a significant amount through their diligent efforts.

Useful Design Tactics

Designing a difficulty staircase progression requires a lot of prototype testing with “unaffected” players (people without the disease you’re targeting) at first and then testing with actual patients for final tuning. Here are some useful tactics that streamline this process.

1. Know your audience: Unlike “exergames,” which are built for the general public, most health games are designed specifically for people who are impacted by a physical or cognitive condition or disease. On some measures, their abilities will typically be lower than the average person’s. So while it’s fine to do basic testing of early game prototypes with project staff, you must test your prototypes with the target audience. Very often, you’ll need to decrease the difficulty levels because you’ve overestimated their abilities.

2. Design difficulty progression to align with clinical goals: In the case of the Sit-Stand game, the game was tightly aligned with the clinical goals specified by Drs. Dowling and Melnick at UCSF. The more the patients played the game, standing up and sitting down, the more therapy was delivered. We didn’t include game data as a factor in the clinical analysis but it may have provided a kind of proxy assessment of functional status. I’ll explore the depths of the clinician-game designer collaboration and proxy assessments in an upcoming article in this series.

3. Design/select an “easy-to-tweak” difficulty parameter: As you’re aligning with clinical goals, you should also identify or “design in” a key parameter that you can easily adjust to raise or lower difficulty. With Sit-Stand this was really easy: just the number of repetitions and the time duration. Your game will likely have similar obvious options. Try to narrow it down to one to make your life easier when it’s time to “tune” the game.

4. Create a “designer dashboard”: The optimal approach gives the designer full control of the game difficulty with a designer interface or “dashboard” that lets him/her adjust difficulty and then review how well (or poorly) prototype testers succeed in the game. This frees the developer to tackle more difficult challenges such as new features and bugs (and not “tweaking” difficulty parameters).

To maintain flow, too easy is just as undesirable as too hard. By iteratively testing your games, you’ll empirically discover the sweet spot of appropriate difficulty. I’ll dive deep into the weeds on how to design and construct a designer dashboard in a future article.
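As a rough illustration of tactics 3 and 4, here is a minimal Python sketch of a difficulty staircase driven by a single tunable parameter, plus the kind of per-level success summary a designer dashboard might show. Everything in it is hypothetical: the names (Level, build_staircase, success_rates), the repetition targets, and the tester data are assumptions for illustration, not values from the Balance Games project.

```python
from dataclasses import dataclass


@dataclass
class Level:
    number: int
    target_reps: int  # the single "easy-to-tweak" difficulty parameter


def build_staircase(first_reps: int, step: int, n_levels: int) -> list[Level]:
    """Build levels whose repetition targets rise by a fixed step."""
    return [Level(i + 1, first_reps + i * step) for i in range(n_levels)]


def success_rates(staircase: list[Level], tester_max_reps: list[int]) -> dict[int, float]:
    """Fraction of testers who could complete each level."""
    rates = {}
    for level in staircase:
        passed = sum(1 for reps in tester_max_reps if reps >= level.target_reps)
        rates[level.number] = passed / len(tester_max_reps)
    return rates


# Hypothetical prototype-test data: the most repetitions each tester managed.
testers = [6, 8, 9, 12, 15, 18, 22, 25]
staircase = build_staircase(first_reps=5, step=5, n_levels=5)
for number, rate in success_rates(staircase, testers).items():
    print(f"Level {number}: {rate:.0%} of testers succeed")
```

Changing first_reps or step and rerunning the summary is the code equivalent of "tuning" the staircase: the designer can see at a glance whether the first level is winnable by nearly everyone and whether the top level is still within reach of a few.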

Closed Loop

The other prominent way of creating appropriately difficult challenges uses an algorithm to continually adjust game difficulty based on the player’s current ability. This approach depends upon being able to assess the player’s ability accurately but if that high bar is met, then this method of creating appropriately difficult challenges can be highly effective.

Cover of Nature journal featuring Neuroracer program

Two of the leading practitioners of the closed-loop approach, Adam Gazzaley and Joaquin A. Anguera of the Neuroscape Lab at UCSF, leveraged this approach in their development of the Neuroracer game. Their game so improved the mental flexibility and cognitive control of players in their rigorous clinical study that their work was featured on the cover of the journal Nature in 2013.

Their research helped launch a leading digital therapeutics company, Akili Interactive, whose ADHD gaming therapy just showed impressive results in a major study reported in The Lancet Digital Health (2/24/20): A novel digital intervention for actively reducing severity of paediatric ADHD (STARS-ADHD): a randomised controlled trial.[1]

I recently collaborated with the Neuroscape team on a set of interactive tutorials to help the target audience, seniors in assisted living centers, learn how to play the games. Each game in the three-game set targets a different aspect of cognition that declines as we get older (working memory, task-switching, and selective attention), and all use the closed-loop adaptive system to balance the game difficulty with the player’s current ability.

Screen shots of Neuroscape’s BBT mini-games — Cognitive mini-games of Neuroscape’s BBT program

In the Task Switch game (far left image), players must determine whether a center image is most like one of two comparison images. When a shape appears, the player needs to decide if it’s most like the top or bottom shape. When a colored circle is shown, the player must decide if it’s most like the circle on the right or left. The game then challenges the players to quickly “switch modes” (shape or color) and then make the correct analysis of the center image.

“People’s ability to switch from one task to another declines with age. What we’ve found in previous studies is that with cognitive training like this, people can dramatically improve this skill.” Project Director Joaquin A. Anguera.

During the game’s initial “thresholding” step, the program rigorously assesses each subject’s initial ability on each of the games. This sets the starting point for the closed-loop algorithm. The UCSF team also uses MRI, EEG, and other assessments to thoroughly document the patient’s physical and cognitive condition.

Comparison of easy and harder Task Switch challenges

At the easiest game level, the center circle or shape is very similar to one of the options. As the player gets better, the decision becomes more challenging because the center image becomes more like a mixture of the two options. The player’s choices, correct or incorrect, are factored into the adaptive algorithm. Based on the player’s most recent outcomes, the algorithm selects the optimal center image to present an appropriately difficult challenge.

Developer/Designer Roger Anguera created a tool that produces blends of the tomato and pepper in specific proportions. I’ve shown only 10 proportional blends below but the tool can create up to 50 different blends. Roger’s algorithm can then simply select the appropriately difficult center image based on the player’s current ability.

Graphic showing progressive blends of tomato and pepper — Custom tool creates precisely defined blends of comparison images
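To make the idea of a blend-based difficulty parameter concrete, here is a small Python sketch that maps a difficulty rank to a blend proportion. The function name blend_for_rank and the linear mapping are assumptions made purely for illustration; the article does not describe how Roger’s tool or algorithm actually maps difficulty to blends. Only the figure of 50 blends comes from the text above.

```python
N_BLENDS = 50  # the tool described above can create up to 50 blends


def blend_for_rank(rank: int, n_blends: int = N_BLENDS) -> float:
    """Return the share of option A in the center image for a difficulty rank.

    Rank 1 (easiest) gives a pure image of option A. The highest rank gives a
    blend just above 50/50, where the two options are hardest to tell apart.
    The linear mapping is an assumption, not the Neuroscape implementation.
    """
    rank = max(1, min(n_blends, rank))  # clamp to the available blends
    return 1.0 - 0.5 * (rank - 1) / n_blends


for rank in (1, 10, 26, 50):
    share = blend_for_rank(rank)
    print(f"rank {rank:2d}: {share:.0%} option A, {1 - share:.0%} option B")
```

In a real game the dominant option would presumably be chosen at random on each trial so that the correct answer is not always the same side.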

With an easily defined difficulty parameter in hand, the blend amount, Roger then applied a well-known entertainment game design heuristic known as “1-up, 3-down”.

“When a player gets the right answer, we increase the difficulty rank by 1, just a little harder, and keep going until they make the wrong choice. Then we decrease the difficulty by 3. We drop the difficulty by 3 to keep the success rate to around 70%.” Developer/Designer Roger Anguera.

Dropping the difficulty by 3 points after a loss also helps prevent the player from suffering several losses in a row, which would tend to drop them out of flow. After the second loss, the algorithm drops another 3 points of difficulty (down 6 in 2 attempts). The game is now noticeably easier, which helps the player get back in the groove. The adaptive system then adjusts as they perform better.
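The rule is simple enough to sketch in a few lines of Python. This is an illustrative reconstruction of the “1-up, 3-down” heuristic as described in the quote, not the Neuroscape code: the rank range of 1 to 50 and the simulated player model are assumptions added so the example can run.

```python
import random


def next_difficulty(rank: int, correct: bool, lo: int = 1, hi: int = 50) -> int:
    """Apply the 1-up, 3-down rule and clamp to the valid range."""
    rank = rank + 1 if correct else rank - 3
    return max(lo, min(hi, rank))


def simulate(trials: int, skill: int, seed: int = 0) -> float:
    """Simulate a hypothetical player and return their overall success rate.

    Assumed player model (not from the article): the chance of a correct
    answer falls from 0.95 toward 0.5, a coin flip between two options,
    as the difficulty rank rises past the player's skill level.
    """
    rng = random.Random(seed)
    rank, wins = 1, 0
    for _ in range(trials):
        p_correct = min(0.95, max(0.5, 0.75 - 0.05 * (rank - skill)))
        correct = rng.random() < p_correct
        wins += correct
        rank = next_difficulty(rank, correct)
    return wins / trials


# Two losses in a row from rank 10: 10 -> 7 -> 4 (down 6 in 2 attempts).
print(next_difficulty(next_difficulty(10, False), False))
print(f"long-run success rate: {simulate(20000, skill=25):.2f}")
```

The arithmetic behind the target success rate is worth spelling out. The difficulty stops drifting when one step up per win balances three steps down per loss: p × 1 = (1 − p) × 3, which gives p = 0.75. That is in line with the “around 70%” figure in the quote, and it holds whatever the player’s skill, because the rule keeps moving the difficulty until the player’s success rate settles there.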

The Neuroscape team will be conducting tests of the BBT program in the coming year to see if it helps seniors sustain the three cognitive abilities targeted by the game (selective attention, working memory, and task switching), and possibly even improve them.

Because the game difficulty is continually updated based on the player’s ability, Closed Loop is the most efficient method for creating appropriately difficult game challenges. And if that were the only important factor in creating effective health games, then there would be no need to use Difficulty Progression or any other method for generating the Goldilocks condition.

A Question of Balance

In his book Flow: The Psychology of Optimal Experience[2], psychological researcher Mihaly Csikszentmihalyi articulates several key conditions that must be met for people to experience the highly focused flow state. When comparing the Closed Loop and Difficulty Progression approaches, two competing conditions seem most relevant:

A balance between challenge and skills
A feeling of control over the task

Closed Loop excels at balancing the challenge with the player’s skills by continually assessing the player’s ability. However, because the program determines the challenge, not the player, it doesn’t rank as high on the “control” criterion for flow.

Graphic showing relative strengths of Closed Loop and Difficulty Progression approaches — Relative strengths of Closed Loop and Difficulty Progression Approaches

In contrast, Difficulty Progression excels in letting the player control the game by allowing them to select which game level to attempt. This freedom, however, means that the player is sometimes challenged too little (or too much) to achieve the flow state.

A Potential Hybrid Solution?

A possible hybrid strategy would leverage the best features of these two approaches to achieve optimal results. Based on an athletic training model, it would consist of two distinct components: training sessions and competitions.

Training Sessions

In an athletic context, it’s common to “over-train” to build up strength or endurance. I can still remember suffering through 14-mile cross-country practice runs, even though the actual races were only 2 ½ miles long. Swimmers endure grueling workouts of five or ten thousand meters to prepare their bodies to compete at distances much, much shorter.

You trained hard because competitions only came around every so often. You knew you would have to improve to compete–the question was whether you could improve enough! So while we grumbled on our long training slogs through the cold, damp New Jersey suburbs, we knew these efforts were necessary to face the challenge of the next track meet.

Back to health games, you don’t want to carry the ‘athletic training’ theme too far. Our players have a particular condition that needs to be considered and safety is always paramount with a medical treatment. But as I’ve personally seen with patients tackling a difficulty staircase, most people seem to like an appropriate challenge.

The high efficiency of the Closed Loop approach makes it the perfect strategy for training sessions that focus on maximal impact with less emphasis on control. And within the “training” context, players may be more willing to take on harder algorithmically-tuned challenges–as long as they have a chance to exert control at some point.

Competitions

The second part of the hybrid approach would be a difficulty progression of competition levels that gives the player more control over the challenge they want to face. Players would need to “climb” the staircase by winning each competition level in turn … no jumping steps. But they would choose when to take on a new competition level … and not the program.

When designing a difficulty staircase-only approach, the steps of the staircase should be high enough that roughly half of players succeed at first try. But in this hybrid approach, the steps should be higher to motivate players to work harder during the training sessions.

Diagram comparing difficulty staircases for hybrid approach — Comparison of step heights

The competition levels, benchmarks in a sense, could also give clinicians a valuable proxy assessment of the patient’s functional status. Such a metric, performed during each gaming session, would provide an amazingly detailed dynamic record of the patient’s experience with the gaming regimen.

Short note: I’ve shown 4- and 5-step staircases in these simplified diagrams but staircases can be easily extended with additional steps, provided you’ve built in an easily adjusted difficulty parameter.

A Sum Better Than Its Parts

The best way to combine Closed Loop training sessions and a Difficulty Staircase of competitions would be determined through careful research that explored the optimal ratio of these two approaches. Key areas of inquiry would include:

1. Optimal ratio of training sessions to competitions — Most of the player’s health improvements are likely to occur during the longer and often more strenuous training sessions. This would tend to warrant a higher ratio, more training sessions per competition. For example, players might need to complete three or four training sessions before they could attempt a competition (or reattempt one). The counterbalancing factor is the player’s need for control of some part of the gaming experience.

A series of “design experiments” would systematically adjust the training/competition ratios and measure their impacts on the players’ subjective sense of control and overall engagement with the program. The optimal ratio would balance the short-term benefits of more training sessions with the long-term goal of encouraging players to “stay with the program” and continue their sessions in the gaming regimen.

2. Optimal Competition Step Height — The steps of the difficulty progression in a hybrid program would be higher than a difficulty progression-only approach but how much higher would the optimal step height be? We want players to feel the pressure to meet the challenge of a tough “next step up”. This “mini-goal” can provide short-term motivation … but only if the next step seems ultimately winnable.

Another series of design experiments would incrementally adjust the height of a difficulty step while measuring each player’s difficulty progression, subjective sense of control, as well as the overall distribution of players across the entire staircase. We wouldn’t want all of the players stuck at the lower levels because the step height is too high; neither would we want everyone at the highest levels.

These design experiments would involve a lot of user-testing, initially with people unaffected by a particular disease, to explore how the training/competition ratio impacts outcomes. A similar experiment would investigate how step height affects outcomes as well.

This knowledge and experience would then produce a tunable gaming therapy model that would work across multiple timescales. Balancing difficulty helps produce the second-by-second experience of flow. Giving players a sense of control of their therapy increases their motivation to follow through with the gaming regimen over 8–10 weeks.

Although this gaming therapy model would need to adapt to the needs of particular conditions and diseases, most of this tuning would be performed, somewhat automatically, during the tuning of the Closed Loop training sessions. Data from that tuning would inform how the heights of the competition difficulty staircase are tuned.
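To make the two tunable knobs discussed in this section concrete, here is a minimal Python sketch of how a hybrid program might gate competitions behind training sessions. It is a thought experiment, not an existing system: the class HybridProgram, its field names, and the default values are all hypothetical. The step height itself would live in the definition of each competition level, which is omitted here.

```python
from dataclasses import dataclass


@dataclass
class HybridProgram:
    """Hypothetical gate between closed-loop training and competitions."""
    sessions_per_attempt: int = 3  # training/competition ratio (tunable)
    top_level: int = 5             # number of competition steps (tunable)
    level_won: int = 0             # highest competition level won so far
    sessions_banked: int = 0       # training sessions since the last attempt

    def record_training(self) -> None:
        self.sessions_banked += 1

    def can_compete(self) -> bool:
        """True once enough training sessions are banked for the next step."""
        return (self.level_won < self.top_level
                and self.sessions_banked >= self.sessions_per_attempt)

    def attempt_competition(self, won: bool) -> int:
        """Record an attempt at the next level up; steps cannot be skipped."""
        if not self.can_compete():
            raise RuntimeError("more training sessions are required first")
        self.sessions_banked = 0  # win or lose, training resumes
        if won:
            self.level_won += 1
        return self.level_won


program = HybridProgram(sessions_per_attempt=3)
for _ in range(3):
    program.record_training()
print(program.can_compete())            # True
program.attempt_competition(won=False)
print(program.can_compete())            # False
```

Note that can_compete only unlocks the next competition. The player still decides when to attempt it, which is where the sense of control comes from. A design experiment would vary sessions_per_attempt across groups of players and compare engagement.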

Wrap-Up

This is the third article in the Secrets of Creating Effective Health Games series. If you haven’t checked out the first two articles, you can click the links below. I plan to release one a week for the next few weeks so please follow me to get the latest article.

1) How Health Games Create Flow to Help People Battle Chronic Diseases

2) The Secrets of Creating Effective Health Games — Timescale Design

3) Secrets of Generating the “Goldilocks Condition” in Health Games

4) Secrets of Creating “Clinically-Inspired” Health Games

5) Secrets of Engineering Efficacious And Effective Health Games

This series is based on my experience designing and managing the production of four in-depth, clinically-inspired games that helped a diverse group of people: seniors with Parkinson’s disease, children with cerebral palsy, adults with multiple sclerosis, and kids with severe anxiety. I really enjoy applying my game design and team-building skills in the creation of games that can help people get better.

My experience creating and teaching the Designing Health Games course at the American University Game Lab (2016–2019) helped gel some ideas that I’ve shared here. I had to update the syllabus every year to keep up with this rapidly emerging field!

Any and all comments welcome!

[1] https://www.thelancet.com/journals/landig/article/PIIS2589-7500(20)30017-0/fulltext

[2] Csikszentmihalyi, Mihaly. Flow: The Psychology of Optimal Experience. New York: Harper & Row, 1990.