Operant Conditioning and Interactive Media

Cameron Leonard
Interactive Designer's Cookbook
May 2, 2017

Our Chef: B. F. Skinner

“Education is what survives when what has been learned has been forgotten.”

B. F. Skinner (Source:https://www.biography.com/.image/t_share/MTE5NTU2MzE2MzcxNTg0NTIz/bf-skinner-9485671-1-402.jpg)

Burrhus Frederic Skinner, born in Susquehanna, Pennsylvania in 1904, was an American psychologist. He was Harvard to the core, receiving his Ph.D. there in 1931 and then serving as a Harvard researcher and professor for a significant portion of the rest of his life.

Proclaimed "America's most influential behavioral scientist" by the B. F. Skinner Foundation, he focused his research on what causes people to repeat certain actions.

He found that the main influencing factor was not free will, which he considered an illusion, but rather the consequences of those repeated actions.

Unlike Ivan Pavlov, who conditioned reactions in his subjects by pairing them with otherwise unrelated stimuli presented beforehand, Skinner found that behavioral control hinged heavily on whatever came after the subject acted. Though his research delved into behavioral control methods that may sound frightening, Skinner argued for teaching through positive reinforcement and in the most humane way possible.

Our Ingredient: Operant Conditioning

“It is a mistake to suppose that the whole issue is how to free man. The issue is to improve the way in which he is controlled.”

Operant conditioning, the primary focus of Skinner's research, is the conditioning of behaviors based on their consequences. The core of operant conditioning is this: "behavior which is reinforced tends to be repeated (i.e. strengthened); behavior which is not reinforced tends to die out or be extinguished (i.e. weakened)." The most infamous of Skinner's conditioning experiments used the aptly named Skinner box.

An example of a Skinner box (Source:https://www.simplypsychology.org/skinner%20box.jpg)

In the example depicted above, a rat is placed into a box containing a lever. In some trials, pressing the lever dispensed food, which the rat discovered by accidentally bumping into it; these rats eventually went straight to the lever whenever they were placed in the box. This is positive reinforcement. In other trials, an unpleasant electric current ran through the box as soon as the rat was placed inside. Once again the rat would accidentally bump the lever while moving around, shutting the current off, and eventually these rats also went straight to the lever. This is negative reinforcement. Behaviors can also be deterred through punishment, for instance if pressing the lever switched the electric current on instead of off.

When using operant conditioning, Skinner and his colleagues determined different effective schedules of reinforcement:

Response rate over time for different reinforcement schedules (Source:https://www.simplypsychology.org/schedules-reinforcement.jpg)

Continuous Reinforcement: Wherein a behavior is reinforced every time it is performed. Subjects grow bored of the behavior quickly once the reinforcement stops.

Fixed Ratio Reinforcement: Wherein a behavior is reinforced after it is performed a specific number of times, such as receiving a reward in a video game after a certain number of wins.

Fixed Interval Reinforcement: Wherein a behavior is reinforced after a specific amount of time, such as being paid hourly for working.

Variable Ratio Reinforcement: Wherein the behavior is reinforced after it is performed an unpredictable number of times, such as with gambling via slot machine. Highly engaging.

Variable Interval Reinforcement: Wherein a behavior is reinforced after an unpredictable amount of time has passed, such as receiving an item after a random amount of time playing a video game.

The most engaging schedules of reinforcement are those with the lowest rates of extinction, that is, the ones subjects grow bored of least quickly. These are variable ratio and variable interval reinforcement, which are particularly engaging precisely because of their unpredictability.
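To make the distinctions concrete, here is a minimal sketch of the four partial schedules as reward rules a game loop might consult before granting a bonus. The class names and parameters are my own illustration, not from Skinner's work or any particular engine.

```python
import random
import time

class FixedRatio:
    """Reinforce every n-th response, e.g. a reward every 5 wins."""
    def __init__(self, n):
        self.n = n
        self.count = 0

    def should_reinforce(self):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False

class VariableRatio:
    """Reinforce each response with probability 1/n, so rewards arrive
    after an unpredictable number of responses (slot-machine style)."""
    def __init__(self, n):
        self.p = 1.0 / n

    def should_reinforce(self):
        return random.random() < self.p

class FixedInterval:
    """Reinforce the first response after a fixed amount of time has passed."""
    def __init__(self, seconds):
        self.seconds = seconds
        self.last = time.monotonic()

    def should_reinforce(self):
        now = time.monotonic()
        if now - self.last >= self.seconds:
            self.last = now
            return True
        return False

class VariableInterval:
    """Reinforce the first response after an unpredictable amount of time."""
    def __init__(self, mean_seconds):
        self.mean = mean_seconds
        self.next_at = time.monotonic() + random.expovariate(1.0 / self.mean)

    def should_reinforce(self):
        now = time.monotonic()
        if now >= self.next_at:
            self.next_at = now + random.expovariate(1.0 / self.mean)
            return True
        return False
```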

Examples of Operant Conditioning in Interactive Media

“As the senses grow dull, the stimulating environment becomes less clear. When reinforcing consequences no longer follow, we are bored, discouraged and depressed.”

Interactive media, especially video games, are full of operant conditioning. Done well, it can make a game so engaging that it borders on addictive.

An example of a log-in bonus in Nintendo’s Fire Emblem Heroes (screenshot mine)

The Fixed Interval Schedule

A common example of operant conditioning in games is a bonus received for playing at least once a day. This follows the fixed interval schedule of reinforcement, and the bonus is ordinarily something highly desirable so that players have the maximum incentive to play daily. In Nintendo's recently released mobile game Fire Emblem Heroes, orbs, the item used for summoning more heroes, are often given as daily log-in bonuses during special events and on Sundays. Aside from paying real money, this is one of the very few ways to acquire orbs, so these log-in bonuses are highly coveted. Another game with daily log-in bonuses is Sony's racing simulator Gran Turismo 6, which grants players in-game credits that can be used to purchase more cars, and it takes things a step further: when players log in to their PlayStation Network accounts on multiple days in a row, the bonuses increase, encouraging even more frequent play.
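As a rough sketch of how a daily log-in bonus with an escalating streak might be wired up, here is some Python. The function name, bonus values, and streak rules are invented for illustration and are not taken from either game.

```python
from datetime import date, timedelta

# Hypothetical escalating daily bonus: the values are illustrative only.
STREAK_BONUSES = [100, 150, 200, 300, 500]  # in-game credits per consecutive day

def claim_daily_bonus(player, today=None):
    """Grant at most one bonus per calendar day, growing with the streak."""
    today = today or date.today()
    if player.get("last_claim") == today:
        return 0  # already claimed today: the fixed interval has not elapsed

    if player.get("last_claim") == today - timedelta(days=1):
        player["streak"] = player.get("streak", 0) + 1  # consecutive day
    else:
        player["streak"] = 0  # streak broken, start over

    player["last_claim"] = today
    bonus = STREAK_BONUSES[min(player["streak"], len(STREAK_BONUSES) - 1)]
    player["credits"] = player.get("credits", 0) + bonus
    return bonus

# A player logging in on consecutive days earns growing bonuses.
player = {}
print(claim_daily_bonus(player, date(2017, 5, 1)))  # 100
print(claim_daily_bonus(player, date(2017, 5, 2)))  # 150
print(claim_daily_bonus(player, date(2017, 5, 2)))  # 0, already claimed today
```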

An example of daily quests in Hearthstone (Source:https://pandamanana.files.wordpress.com/2015/05/hearthstone-daily-quests.jpg)

Another example of operant conditioning on the fixed interval schedule is games that offer daily tasks that can be completed for rewards. This has been employed in a variety of games, from the daily quests in Blizzard's Hearthstone (a collectible card game based on the Warcraft universe) to the daily challenges in Codemasters' DiRT Rally (a realistic rally racing simulator). Much like the log-in bonuses, the rewards are often something very useful, frequently in-game currency. Because these tasks must be completed before players receive the rewards, they arguably garner even greater engagement than log-in bonuses. Friends have told me about games in which they simply play the dailies for the rewards and then stop, yet they are still playing those games nearly every day.
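A daily quest system runs on the same fixed interval logic, just gated behind completing a task. A minimal sketch, with a hypothetical quest pool and reward values of my own invention rather than anything from the games above:

```python
import random
from datetime import date

# Hypothetical quest pool: (description, reward in in-game currency).
QUEST_POOL = [
    ("Win 3 matches", 50),
    ("Play 5 games with a new deck", 60),
    ("Deal 100 damage to enemy heroes", 40),
]

def daily_quest(today=None):
    """Pick the same quest for everyone on a given day by seeding on the date."""
    today = today or date.today()
    rng = random.Random(today.toordinal())
    return rng.choice(QUEST_POOL)

def complete_quest(player, quest, today=None):
    """Fixed interval reinforcement: the reward can be collected once per day."""
    today = today or date.today()
    name, reward = quest
    if player.get("last_quest_day") == today:
        return 0  # today's reward has already been collected
    player["last_quest_day"] = today
    player["gold"] = player.get("gold", 0) + reward
    return reward
```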

An example of a very lucky summoning session in Fire Emblem Heroes (screenshot mine)

Gacha Games: Variable Ratio Reinforcement

A recent trend I have noticed in the world of mobile games in particular is the rise of what are known as "gacha games". The name is derived from gachapon, a kind of Japanese blind-box toy vending machine. These games, such as the aforementioned Fire Emblem Heroes and others like Puzzle and Dragons and the currently Japan-only Fate/Grand Order, generally have players spend a rare in-game currency to summon more characters to use in gameplay. They use variable ratio reinforcement in much the same way as a slot machine: players can receive characters of different levels of rarity, with rarer characters (such as the five-star characters depicted above) being stronger. Since the currencies used for summoning can be purchased in bulk with real-world money, and the chances of pulling rare characters are very low (6% for a five star in Fire Emblem), these games can invite many of the same problems that slot machines do and would likely be very problematic for people predisposed to gambling addiction. However, since these games do offer some of the relevant currency for free and tell players the exact percentage chance of receiving each rarity of character, I would argue that they are far more honest than slot machines.
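Mechanically, a gacha summon is just a weighted random draw. Here is a minimal sketch; only the 6% five-star figure is mentioned above, and the remaining weights are made up for illustration rather than taken from any game's actual rate table.

```python
import random

# Illustrative rarity table for a gacha-style summon. Only the 6% five-star
# rate appears in the text; the other weights are invented here.
RARITY_WEIGHTS = {
    "5-star": 0.06,
    "4-star": 0.36,
    "3-star": 0.58,
}

def summon(rng=random):
    """Variable ratio reinforcement: each pull is an independent weighted draw,
    so a rare reward arrives after an unpredictable number of attempts."""
    roll = rng.random()
    cumulative = 0.0
    for rarity, weight in RARITY_WEIGHTS.items():
        cumulative += weight
        if roll < cumulative:
            return rarity
    return "3-star"  # guard against floating point rounding

# Roughly how often does a five-star turn up?
pulls = [summon() for _ in range(100_000)]
print(pulls.count("5-star") / len(pulls))  # ~0.06, i.e. about 1 in 17 pulls
```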

Players earn experience at the end of every Overwatch game based on their skills (Source:http://cdn.gamer-network.net/2016/usgamer/Overwatch-XP-Screen-01.jpg)

Combinations of Techniques

The final example I will give incorporates several different reinforcement schedules at once. In Blizzard's hero-based team shooter Overwatch, players earn experience at the end of every match based on whether they won or lost, the amount of time spent in the match, and their performance in the match itself.

Upon leveling up, players receive a loot box containing all manner of in-game items. Speaking from personal experience, this can be very engaging, and I often start Overwatch thinking "I'll play until I get a loot box." There are some problems with this method, however. The amount of experience required to level up caps at 22,000, but even that sometimes feels too far away, making play feel like a slog. In addition, the in-game items in loot boxes come in different tiers of rarity; if the item a player wants is of legendary rarity, the highest tier, they are unlikely to get it. As a result, earning a loot box is either very gratifying if the player receives an item they want, or very disappointing if they don't. In essence, when this method works it works very well, but sometimes it leads to disappointment and a feeling that the player's time has been wasted.
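To show how the pieces combine, here is a rough sketch of a match-XP-to-loot-box loop. The XP formula, box contents, and drop rates are invented stand-ins; only the 22,000 experience figure comes from the text above, and the real game's values will differ.

```python
import random

XP_CAP = 22_000  # the per-level requirement tops out around here (simplified to a flat cap)

def match_xp(won, minutes_played, medals):
    """Illustrative stand-in for end-of-match experience."""
    xp = minutes_played * 250      # time spent in the match
    xp += 500 if won else 0        # win bonus
    xp += medals * 150             # rough proxy for personal performance
    return xp

# Invented drop weights for the rarity tiers.
LOOT_RARITIES = [("common", 0.75), ("rare", 0.18), ("epic", 0.05), ("legendary", 0.02)]

def open_loot_box(rng=random):
    """Each item is a weighted draw, so the item a player wants may take an
    unpredictable number of boxes: variable ratio reinforcement."""
    items = []
    for _ in range(4):  # each box contains several items
        roll, cumulative = rng.random(), 0.0
        for rarity, weight in LOOT_RARITIES:
            cumulative += weight
            if roll < cumulative:
                items.append(rarity)
                break
    return items

def play_match(player, won, minutes_played, medals):
    """Fixed-ratio-style progress: a loot box every time the XP bar fills."""
    player["xp"] = player.get("xp", 0) + match_xp(won, minutes_played, medals)
    while player["xp"] >= XP_CAP:
        player["xp"] -= XP_CAP
        player["boxes"] = player.get("boxes", 0) + 1
```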

Operant conditioning is a powerful tool in interactive media. Used well, it creates a very engaging, rewarding experience for the player. Used poorly, it can make the player feel like their time has been wasted. On the darker side, misuse of operant conditioning can push some players toward what could be considered video game addiction, a topic my colleague Shelby Kuster elaborates on in her article. Use operant conditioning well in your game and players will find themselves wanting to play like clockwork.
