Social chess: stakeholders with impactful feedback loops

Jean Czerlinski Whitmore Ortega
Nov 8, 2021 · 14 min read


How machine learning models are embedded in a web of strategic moves

Photo by Alina Grubnyak on Unsplash

If there were a mistake in your model, would real-world usage amplify or dampen that mistake? Reviewing a model’s stakeholders and their feedback loops can help you answer that question. You need to look many steps ahead in the web of strategic moves.

When a machine learning model learns to play a game like chess, the model does not just look for the best next move. Instead, the model anticipates the other player’s best response to its next move, and how it can best respond to the response, and so on for some level of depth, limited by time or memory (if it cannot reach the end of the game). In contrast, most machine learning models have no consideration of how others will respond to their output. Why not? I argue that many machine learning models are involved in a game of social chess, where it is worth identifying potential chess players and whether they might have an impactful response. I will identify stakeholders with impactful responses for models including language translation, predictive policing, map algorithms that recommend routes, and search algorithms, among others. The responses can result in feedback loops, which may amplify or dampen deviations like mistakes. But before I delve into complex and creative cases, let me start with the simplest and most direct feedback, which is described by control theory.

Feedback dampens or amplifies deviations

Control systems, such as a thermostat, are designed to use feedback to achieve a goal. The thermostat measures the current temperature, and if the measured temperature is below the goal, the thermostat turns on a furnace in order to increase the temperature. When the measured temperature is at or above the goal, it switches off the furnace again. The thermostat is an example of “negative” feedback, meaning that a deviation in temperature from the goal will be “negated” by switching on the furnace. Negative feedback is desirable in many control, optimization, and error-correction systems.

The other main type of feedback is called “positive feedback.” In this case, deviations are amplified. A physical example is a microphone connected to an amplifier and speaker. If the microphone is near the speaker, any little sound will quickly get amplified into a louder sound that reaches the maximum volume of the speaker. Another example is the spread of internet memes: the more popular a meme is, the more it spreads. Model builders should keep a lookout for positive feedback because it can break models by amplifying a small mistake into a large mistake that subverts the model’s goal.

In short, negative feedback dampens deviations while positive feedback amplifies them. So how does this relate to machine learning models? Let me walk through an example with relatively direct feedback.
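
To make the two regimes concrete, here is a minimal sketch (my own toy illustration; the gain values are made up) of how the same initial deviation shrinks under negative feedback and blows up under positive feedback:

```python
# Minimal feedback sketch (illustrative only; the gain values are made up).
def simulate(gain, deviation=1.0, steps=10):
    """Each step, the system turns the current deviation into gain * deviation."""
    history = [deviation]
    for _ in range(steps):
        deviation = gain * deviation
        history.append(deviation)
    return history

# A gain below 1 acts like the thermostat: the deviation shrinks toward zero.
print(simulate(gain=0.5))  # 1.0, 0.5, 0.25, ... (dampened)

# A gain above 1 acts like the microphone near its speaker: the deviation explodes.
print(simulate(gain=2.0))  # 1.0, 2.0, 4.0, ... (amplified)
```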

How feedback broke language translation

Language translation models have been hugely successful, and yet by the 2010s, a problem related to feedback was discovered. The training data for many translation models — the “ground truth” — was scraped from the internet, but the internet was increasingly dominated by sentences generated by models rather than humans. As a consequence, the models ended up training on their own translations rather than human translations [1]. Why does that matter?

Consider what happens when a model happens to generate an inaccurate translation. For example, a correct translation of English “cat” to German is “Katze,” but suppose version N of the model incorrectly outputs “Hund,” which means dog. From now on, the ground truth scraped from the web will include this mistake, so training will consider a translation correct when an input of “cat” is mapped to “Hund.” That means version N+1 and all future versions consider the mistake to be correct! The feedback loop means the mistake has become entrenched rather than being corrected, as it would have been using data from human translators who continued to translate “cat” as “Katze.”

By 2011 a corrective technique was proposed: a form of “watermark” was added to the machine-generated translations so that they could be excluded from future model training [1]. That is, the problem was addressed by preventing the output from becoming input — the loop was broken.
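
Here is a toy sketch of that idea (the names, the marker character, and the dictionary “model” are all my own invention; the real watermarking scheme in [1] is far more subtle). The point is only that excluding marked output keeps the model’s mistake from becoming ground truth:

```python
# Toy sketch: a "translation model" that is just a dictionary learned from
# scraped (english, german) pairs, with machine output carrying a marker.
WATERMARK = "\u200b"  # hypothetical invisible marker appended to machine output

def train(scraped_pairs):
    """Learn a one-word 'model' from scraped pairs, excluding any target
    sentence that carries the watermark."""
    return {en: de for en, de in scraped_pairs if WATERMARK not in de}

def translate(model, word):
    # Machine output is tagged so future scrapes can recognize and drop it.
    return model.get(word, word) + WATERMARK

# The web contains a human translation plus a mistaken machine translation.
web = [
    ("cat", "Katze"),             # human-translated page
    ("cat", "Hund" + WATERMARK),  # version N's mistake, scraped back in
]

# Because the watermarked pair is filtered out, version N+1 still learns
# "Katze"; without the filter, the mistake would become the "ground truth".
print(train(web)["cat"])  # -> Katze
```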

Many stakeholders, some of whom may create feedback

Why was the language translation feedback loop not obvious beforehand? Because there are many types of people using the model, many types of “stakeholders.” Translation models are used by vacationers in a foreign country reading a menu and by companies printing “danger” signs in multiple languages. Only one set of stakeholders caused the feedback loop, namely the major web content creators who publish language translations — the people whose websites were used as a source of ground truth. And they only caused feedback after most of them switched from human translation to machine translation.

Consider the analogy to control systems again. When the thermostat detected the temperature was below its goal threshold, it switched the furnace’s state from off to on. The web content creators similarly switched to machine translation once the models were accurate enough and were cheaper and faster than human translators. Accuracy and cost are akin to the temperature reaching a threshold and triggering the furnace to switch on.

In social systems, every stakeholder is a potential controller who can change their state, their behavior. Sometimes people change for their own reasons — an exogenous change — but many times they change in response to a model — an endogenous change. Endogenous change will result in a feedback loop if there is a chain of events that ends up changing model inputs. So we can see models nestled in a network of potential social controllers, the stakeholders.

Alternatively, we can see each stakeholder as a potential chess player who may move their piece on the chess board in response to a model’s move. The advantage of the chess analogy is that instead of restricting a stakeholder to a deterministic switch response, chess emphasizes that stakeholders have a variety of options in how they respond. Chess also evokes the tools of strategic decision-making, with its risks and trade-offs, as well as reinforcement learning and incentives. Our deployed machine learning models are embedded in a vast network of potential social chess games.

In either analogy, the key is reviewing all of a model’s stakeholders and how they might respond to it, in order to assess whether those responses could affect the model’s ability to achieve its goal.

From natural science to social science

When a model is moved from a physical application to a social application, model builders should pay particular attention to how new stakeholders react. An instructive story comes from a model that was initially used to detect earthquakes and re-purposed to detect crimes. The earthquake model used past seismograph readings to predict where the next earthquake was most likely to happen.

Then a company called PredPol decided to apply the earthquake model to what is called “predictive policing.” It’s a simple idea: predict where crime will be so police officers can be sent there. PredPol was used by local police in Kansas, Washington, South Carolina, California, Georgia, Utah, and Michigan [2]. But PredPol’s usage has a positive feedback loop because many crimes, such as drug or weapons possession, are only detected when police are present, so sending more police to a region increases detected crimes. That is positive feedback, meaning small deviations are amplified.

For example, suppose the north region has a ground truth crime rate of 11% and the south region has 10%, and that equal numbers of police are initially sent to both regions. More crime is detected in the north, so PredPol predicts more crime in the north, so more police go to the north, which is appropriate. However, the feedback loop begins when sending more police to the north results in even more crimes detected in the north, increasing PredPol’s input crime rate above 11%. PredPol sees an increase in crime and raises its forecast, so even more police go to the north, until eventually all the police are sent to the north. A 2017 paper explained, “This is a classic ‘go with the winner’ problem where feedback causes a runaway effect on the estimated probabilities” [3]. But the model’s goal was to help allocate police according to the ground truth crime rates, so it failed. However, the police department using the model might not realize the failure because they do not know the ground truth crime rates and there is a lot of noise in the data.
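
A toy simulation makes the runaway visible (this is my own illustration, not PredPol’s actual algorithm; the rule of allocating police in proportion to detected crime is an assumption):

```python
# Toy runaway simulation: detected crime scales with police presence, and
# next round's police allocation follows detected crime.
true_rates = {"north": 0.11, "south": 0.10}   # ground-truth crime rates
police = {"north": 0.5, "south": 0.5}         # start with an even split

for _ in range(20):
    # Detected crime scales with how much police presence a region gets.
    detected = {r: true_rates[r] * police[r] for r in true_rates}
    total = sum(detected.values())
    # Next round, allocate police in proportion to detected crime.
    police = {r: detected[r] / total for r in detected}

print(police)  # the north's share keeps climbing toward 1.0: a runaway effect
```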

The core difference between the earthquake application and the crime application is in the role of stakeholders in data collection. Seismographs record all quakes, regardless of predictions. But crimes are detected more where police are sent, and the police are changing where they go based on the model’s forecasts. That would be like moving all our seismographs to the shakiest fault and ignoring all the rest.

For this feedback loop, there is a proposed fix: When creating input for PredPol, down-weight crime counts for each region to the degree that PredPol directed police to that region [3]. So if 10% more police are sent to the north than the south based on PredPol’s output, we downweight the north’s detected crime by 10% when creating inputs for PredPol. Conceptually, this is a probabilistic version of the “watermark” on machine learning translations that helps the model ignore its own outputs when creating inputs.
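
Here is a sketch of that correction (a simplification of what [3] actually proposes; the adjustment formula is my own illustration):

```python
# Sketch of the down-weighting described above: discount each region's
# detected crime by the model-driven extra police presence, so the model
# stops hearing echoes of its own output.
def adjusted_inputs(detected, police_share, baseline_share):
    adjusted = {}
    for region, count in detected.items():
        extra = (police_share[region] - baseline_share[region]) / baseline_share[region]
        adjusted[region] = count / (1.0 + max(extra, 0.0))
    return adjusted

# The model sent 10% more police north than the even baseline, so the north's
# detected crime is divided by 1.10 before becoming next round's input.
print(adjusted_inputs(
    detected={"north": 121, "south": 100},
    police_share={"north": 0.55, "south": 0.45},
    baseline_share={"north": 0.50, "south": 0.50},
))  # roughly {'north': 110.0, 'south': 100.0}
```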

Unfortunately, coming up with the right downweighting is easier said than done. But it is an important issue because there are many kinds of deviations that PredPol’s positive feedback can amplify. For example, if police are prejudiced to arrest people of one ethnic background more than another, then PredPol will send more police to regions having populations with that ethnic background. That is, if the original data reflects a human prejudice, PredPol will not only reproduce the prejudice (which is bad enough) but amplify it. Conceptually, we can downweight with a “prejudice factor,” but measuring such a factor would be politically fraught.

Furthermore, the positive feedback problems are not restricted to predictive policing. Similar problems happen in “recidivism prediction, hiring algorithms, college admissions, and distribution of loans. In all of these contexts, the outcome of the prediction (e.g., who to hire) determines what feedback the algorithm receives (e.g., who performs well on the job)” [3]. Avoiding reproducing or amplifying human prejudices is one of the big problems studied by the new field of ethical artificial intelligence [4].

Deliberate change in response to a model

The PredPol model had a feedback loop created by the dynamics of data collection. But sometimes a feedback loop is created by a user deliberately changing in reaction to a model. Consider a loan application model, which determines whether a borrower is approved for a loan. The loan applicants who are denied the loan are likely to change in reaction, for example taking out a new credit card to establish more credit history in order to get approved in the future. But applicants who were approved have no incentive to make such changes because they already got their loan!

Recent papers have argued that credit models should take such changes by loan applicants into account. An approach called “performative prediction” (a generalization of strategic classification) offers techniques to anticipate the change in distribution of loan applicants, allowing the model to jump to the equilibrium, the point at which loan applicants no longer try to change [5]. This is similar to anticipating the final equilibrium of chess, where white wins or black wins. Of course, jumping to the end assumes the model can play enough moves ahead to see the end, which may be easier in loan applications than chess — or not.
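
As a crude illustration of the idea (the numbers and the applicants’ response rule are invented, and this sketch iterates to a stable point rather than anticipating it the way the techniques in [5] do), consider retraining a score threshold, letting rejected applicants improve, and repeating until nothing moves:

```python
# Crude stand-in for a performatively stable point: retrain, let rejected
# applicants respond, and repeat until neither side has a reason to move.
def retrain(scores, target_approval=0.5):
    """Set the threshold so roughly the top `target_approval` fraction is approved."""
    ranked = sorted(scores, reverse=True)
    return ranked[int(len(ranked) * target_approval)]

def respond(scores, threshold, effort=5):
    """Rejected applicants push their scores up by a fixed effort, capped at 100."""
    return [s if s >= threshold else min(s + effort, 100) for s in scores]

scores = [40, 55, 60, 62, 70, 75, 80, 90]
threshold = retrain(scores)
for _ in range(50):
    new_scores = respond(scores, threshold)
    new_threshold = retrain(new_scores)
    if new_threshold == threshold and new_scores == scores:
        break  # neither the applicants nor the model moves any more
    scores, threshold = new_scores, new_threshold

print(threshold, scores)  # the stable threshold and the shifted score distribution
```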

Finding unhappy stakeholders in map applications

The examples of feedback I have discussed so far involved relatively obvious model stakeholders — the people transacting with the model. But sometimes, we may have to hunt around to identify all the stakeholders affected, even indirectly, by the use of a model.

Consider a map system that helps drivers find the quickest route from their origin to their destination. A map application promotes equality because it makes the routes have more equal driving times. Whenever a deviation results in route A being quicker than B, the map application sends more cars to A until the routes are back to having the same times. This is a classic negative feedback loop, maintaining equilibrium. Drivers get more consistent driving times.
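
A minimal sketch of that equilibrium (the congestion model is made up): each route’s travel time grows with its traffic, and the app sends each new driver down whichever route is currently quicker:

```python
# Minimal route-equalization sketch with an invented congestion model.
def travel_time(free_flow, traffic):
    return free_flow + 0.1 * traffic  # made-up: each car adds 0.1 minutes

traffic = {"A": 0, "B": 0}
free_flow = {"A": 10.0, "B": 12.0}    # route A starts out quicker

for _ in range(1000):                 # 1000 drivers ask for directions
    quicker = min(traffic, key=lambda r: travel_time(free_flow[r], traffic[r]))
    traffic[quicker] += 1

times = {r: travel_time(free_flow[r], traffic[r]) for r in traffic}
print(times)  # the two routes end up with nearly identical travel times
```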

But if you search enough, you can find some stakeholders who do not like the new driving equilibrium. NBC Los Angeles reported in 2014 that “Westside Los Angeles residents are working to fool Waze — a popular traffic app — into believing the side streets are clogged, so that the app stops diverting traffic into their neighborhoods” [6].

However, attempts to manipulate Waze by disgruntled residents have had little overall effect, dampened by the negative feedback loop. So the residents have asked politicians to intervene and have pleaded with Waze to change their routing algorithm [7]. Since deviations don’t propagate in the system, they have to change the system itself.

Inequality spurs change by losers

Now let me discuss a case where the model builders were very confident of avoiding manipulation spurred by feedback loops: Google Search. The founders of Google claimed at a conference in 1999, “Google’s slightly different in that we never ban anybody, and we don’t really believe in spam in the sense that there’s no mechanism for removing people from our index” [8]. Other search engines used keywords or human-curated directories. Google believed that because the PageRank algorithm ranked a page based on the number of its inbound links (other pages linking to the page being ranked), there was no need to guard against manipulation.

However, ranking algorithms tend to produce strong positive feedback loops where the popular become more popular. For example, if the search phrase “Samsonite carry on luggage” returns the Samsonite web page first, then other web page creators are more likely to link to it, for example when creating a page reviewing different kinds of luggage. Samsonite.com gains even more inbound links, making its first rank even more entrenched. Samsonite also gets more sales, thanks to web search users who are more likely to buy from the first-ranked page.
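
A toy sketch of this rich-get-richer dynamic (my own illustration, not Google’s actual ranking pipeline): if every new page links to whichever page currently ranks first, a one-link head start becomes permanently entrenched:

```python
# Toy rich-get-richer sketch: new pages cite the current inbound-link leader.
inbound = {"samsonite.com": 11, "competitor.com": 10}  # small initial edge

for _ in range(1000):  # 1000 new luggage-review pages pick a page to link to
    top_ranked = max(inbound, key=inbound.get)
    inbound[top_ranked] += 1

print(inbound)  # {'samsonite.com': 1011, 'competitor.com': 10}
```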

And that leads to the most controversial stakeholder in the search space — the sellers. Search increases inequality among sellers because sellers who appear low in search results suffer reduced sales. And although the Google founders assumed sellers would have to accept their ranking, in practice sellers found ways to increase their ranking with a collection of techniques dubbed search engine optimization (SEO), which includes a mix of “white hat” (acceptable) and “black hat” (unacceptable) techniques.

One well-publicized example of black hat techniques came in a February 2011 exposé by the New York Times, revealing suspiciously high rankings of retailer J. C. Penney on many seemingly less-relevant search terms. For example, searching for “Samsonite carry on luggage” actually returned J. C. Penney first, above the link for Samsonite.com [9]. The newspaper’s SEO expert showed that J. C. Penney achieved this ranking because “someone paid to have thousands of links placed on hundreds of sites scattered around the Web, all of which lead directly to JCPenney.com,” and such artificial inbound links are against Google’s webmaster guidelines [9].

After the tactics were publicized, Google punished J. C. Penney by drastically and artificially lowering its rankings, causing the average position of JCPenney.com across 59 search terms to drop from 1.3 to 52 [10]. J. C. Penney reacted to the exposé by firing its SEO firm, which it blamed for creating the artificial inbound links to its website. After 90 days of punishment, J. C. Penney’s appearance in Google search returned to what are called “organic” rankings [11].

Over time, Google continues to improve its algorithm to better detect artificial inbound links. Google now has a team devoted to fighting web spam.

This example suggests a rule of thumb: If some stakeholders’ inequality increases as a consequence of a model’s use, the losing stakeholders are likely to attempt to increase their rank — even if the model is really awesome.

Escaping the model’s domain: beyond feedback or games

I have one more important observation to make about search engine optimization: it can be truly creative. Google has a good algorithm, and gaming that algorithm requires creative new ways to avoid detection (similar to email spam).

But creativity goes beyond the tools I mentioned here. Biological evolution has two components — variation and natural selection. Feedback systems have only one component, the equivalent of selection, which amplifies or reduces some forms. And games can only reward or punish existing moves. But creativity invents new chess pieces and new moves in the game.

That said, once the new forms are created, positive feedback loops still amplify them, and negative feedback loops still dampen them, so the study of feedback dynamics is still helpful.

Put in the language of a previous post, creativity may escape the domain of the model, requiring model builders to add new inputs or create new models. From there on, feedback loops again determine the dynamics.

Conclusion

Machine learning researcher Ben Recht has blogged: “As soon as a machine learning system is unleashed in feedback with humans, that system is a reinforcement learning system, not a machine learning system” [12]. This post strongly supports his sentiment about viewing machine learning as embedded in a web of feedback with humans — but differs in some details. Feedback effects are not always impactful enough to be worth modeling. Model builders should review the social network of model stakeholders for potential chess games, looking for stakeholders who have the opportunity and motive to do something impactful to the model’s inputs. Furthermore, I think addressing the feedback will require a variety of tools. Reinforcement learning is a helpful approach, but so are control theory and game playing. And in practice, it is sometimes possible to break the feedback loop by downweighting data affected by the model. In this post, I have only scratched the surface of types of feedback, patterns of when feedback matters, and tools to handle feedback. But acknowledging the importance of model feedback is a good first step.

Bibliography

[1] A. Venugopal, J. Uszkoreit, D. Talbot, F. Och, and J. Ganitkevitch, “Watermarking the Outputs of Structured Prediction with an application in Statistical Machine Translation,” in Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, Edinburgh, Scotland, UK, Jul. 2011, pp. 1363–1372. Accessed: Nov. 05, 2021. [Online]. Available: https://aclanthology.org/D11-1126

[2] C. Haskins, “Academics Confirm Major Predictive Policing Algorithm is Fundamentally Flawed,” Vice, Feb. 14, 2019. Accessed: Nov. 05, 2021. [Online]. Available: https://www.vice.com/en/article/xwbag4/academics-confirm-major-predictive-policing-algorithm-is-fundamentally-flawed

[3] D. Ensign, S. A. Friedler, S. Neville, C. Scheidegger, and S. Venkatasubramanian, “Runaway Feedback Loops in Predictive Policing,” arXiv:1706.09847 [cs, stat], Dec. 2017. Accessed: Apr. 21, 2021. [Online]. Available: http://arxiv.org/abs/1706.09847

[4] M. Kearns and A. Roth, The Ethical Algorithm: The Science of Socially Aware Algorithm Design. Oxford, New York: Oxford University Press, 2019.

[5] J. C. Perdomo, T. Zrnic, C. Mendler-Dünner, and M. Hardt, “Performative Prediction,” arXiv:2002.06673, Feb. 2020. Accessed: Nov. 07, 2021. [Online]. Available: https://arxiv.org/abs/2002.06673v4

[6] M. Mekahlo and J. Giordano, “Drivers Try to Trick Popular Traffic App Waze,” NBC Los Angeles, Nov. 21, 2014. Accessed: Nov. 07, 2021. [Online]. Available: https://www.nbclosangeles.com/news/residents-plan-to-trick-a-popular-traffic-app-into-keeping-traffic-out-of-their-neighborhoods/61132/

[7] G. Stuart, “Waze Hijacked L.A. in the Name of Convenience. Can Anyone Put the Genie Back in the Bottle?,” Los Angeles Magazine, Aug. 20, 2019. Accessed: Nov. 07, 2021. [Online]. Available: https://www.lamag.com/citythinkblog/waze-los-angeles-neighborhoods/

[8] D. Sullivan, “10 Year Retrospective: Search Engine Strategies To SMX: Search Marketing Expo,” Search Engine Land, Nov. 18, 2009. Accessed: Nov. 07, 2021. [Online]. Available: https://searchengineland.com/10-years-search-engine-strategies-to-search-marketing-expo-30060

[9] D. Segal, “The Dirty Little Secrets of Search,” The New York Times, Feb. 12, 2011. Accessed: Nov. 07, 2021. [Online]. Available: https://www.nytimes.com/2011/02/13/business/13search.html

[10] M. Ziewitz, “Rethinking gaming: The ethical work of optimization in web search engines,” Soc. Stud. Sci., vol. 49, Aug. 2019, doi: 10.1177/0306312719865607.

[11] M. McGee, “90 Days Later, J.C. Penney Regains Its Google Rankings,” Search Engine Land, May 23, 2011. Accessed: Nov. 07, 2021. [Online]. Available: https://searchengineland.com/90-days-later-google-lets-j-c-penney-out-of-timeout-78223

[12] B. Recht, “The Ethics of Reward Shaping,” arg min blog, Apr. 16, 2018. Accessed: Oct. 31, 2021. [Online]. Available: http://benjamin-recht.github.io/2018/04/16/ethical-rewards/
