The Strategies of the Artificial: A Narrower Description of the Conceptual Terrain (But Not Narrow Enough Yet)

Adam Elkus
Strategies of the Artificial
34 min read · Sep 7, 2015

In the previous entries (entry 1, entry 2, entry 3), I sketched out a broad view of various elements. I wrote this entry, like I wrote a bunch of the others, to force myself to become narrower and more specific and also just to see how it all fits together. This entry is long, probably doesn’t fit too well together, and could be more efficiently summarized. It also lacks an abstract research question, though the “modeling problems” section at the end of Section II comes awfully close. I had an image in my mind for most of today of how to do that, but by the time I got a lot of other material down it had vanished.

Finally, much of this is vague, repetitive, patched-together, and unwieldy. It reads less like something coherent or cohesive than like a giant dump of both new and old content. A lot of it comes from notes, other components were written on the spot, and older content was haphazardly integrated in.

I wrote this to force myself to generate better research questions than my previous experiments with BWAPI and other modeling programs have produced. I may not limit it to BWAPI (which is StarCraft: Brood War only) — I have thought of doing some basic multi-agent models with a cognitive modeling system and plenty of replay data from games of StarCraft II. I also wanted to force myself to work hard to justify the particular topics and method I will be using, which is hard when there is so much ground to cover, much of it the integration of an enormous number of separate literatures.

I think that as the research questions are sharpened, a better way to summarize, organize, and direct all of this will flow out of them. Again, my hope in writing this is not that it all makes sense right now (it doesn’t, at all) but that it makes me (in the near future) try to write something much shorter that is much more focused on the specific research problems. I am kind of tired of writing justifications and long literature surveys, but if I don’t get practice making the case for why all of it goes together now, I’ll be helpless when I submit a paper. There is also a secondary justification: when I write the “literature review,” “related work,” or other components, I will have an enormous amount of pre-existing material that has already been hashed out a number of times.

Half of the problem here is simply the manner in which this was written (it’s nearly 5 AM as I write this) and the fact that I wrote it continuously over a day. However, I do not think a natural organization and punchy elevator pitch for this will come until I build out the “modeling problems” section into a full-blown, “I explore these abstract research questions” type post. The practical highlight of this particular brainstorming session, I think, is simply getting out the various levels of analysis and processes. Even though this is frankly a mess, I’m putting it up here simply to incentivize myself to keep hacking on it as well as to alert like-minded or interested people that there’s someone else working on a vaguely similar subject.

It will at least be easier to go narrower next time, and without the substantial amount of theoretical, methodological, and name-dropping throat-clearing I feel I inevitably have to do as a preface to what I really want to outline. Hence I won’t rewrite this for a while, not until I feel that I’m much clearer and narrower about the research questions, related work, experimental methods, and all of the other nuts-and-bolts elements.

I: The “Strategies of the Artificial” As Research Program

The Strategies of the Artificial As a Research Program

Computational agent modeling today is split into two main methodological camps. First, there is the basic notion of agent-based modeling as a mechanism by which a large number of simple agents produce complex social behaviors from the bottom up. Second, there is the notion of “cognitive agents” — a few, heavily pre-programmed agents that use either complex agent architectures or a cognitive architecture such as SOAR or ACT-R.

An older tradition of computational modeling that bridges the complex systems tradition and the cognitive agents tradition has been forgotten: the notion of the “sciences of the artificial.” Throughout most of the 20th century, social, behavioral, and computational scientists collaborated to create a unique understanding of computational and biological rationality in the hope that computer engineering and studies of decision behavior could feed off of each other. The notion of decision processes as being broadly computational in nature made it easy to use computer simulations to study real-world social, rational, and cognitive processes. In turn, the more that these scientists understood about human decision behavior, the better the algorithms and programs they could engineer.

A core meeting point for these methods lay in applied policy concerns having to do with defense and security. Pioneering work in cognitive science like George A. Miller’s Plans and the Structure of Behavior was produced at a time when understanding and optimizing human decision making in a Cold War standoff was important. The notion of “mathematical programming” in defense and security suggested parallels between computational theories of organizational behavior and computer programs. It was a time when programs could function as theories of complex, ill-understood rational and social decision making behaviors.

Games, and strategy games/wargames in particular (especially chess), played a key role in this research program, which was rooted in the programming of computational artifacts. The notion was that an artifact’s “inner nature” was functionally adapted to its “outer environment,” and thus by programming a system to play a game of chess or convincingly represent an opponent in a RAND wargame, its functional mapping to the demands of the environment could serve as a scientific existence proof.

While this method had its excesses and is rarely seen today in the social sciences outside of cognitive social modeling such as the kind performed in this edited Ron Sun volume, it had enormous value for illuminating basic mechanisms of decisionmaking in the simulation of complex and often highly murky research domains. It helped formalize intuition. And it contributed lasting ideas like “bounded rationality” and “satisficing” to the various computational sciences that deal with decisionmaking and rationality.

I argue that it is time for the strategies of the artificial, a way of using the research techniques of Simon, Chase, and others for theory development in the simulation and modeling of complex strategic behavior in simplified military simulations. This may help bridge gaping holes between social science representations of strategy and those seen in strategic studies theory about the nature and dynamics of strategy.

Computational Strategy?

As noted in the previous entries, the goal of this research lies in making a basic contribution to the problem of studying complex strategic behavior (of a vaguely military or wargame-esque bent). The problem lies in the structural disjuncture between social scientific representations of strategy and those seen in military and cognitive modeling; see the first entry in this series for a primer on what exactly those differences are. Here I am looking for a middle ground of sorts that can be used to do basic research experimentation — a domain that is not quite like social science but also not so detailed a military simulation that it makes using social science methods impossible.

What kind of strategy am I talking about? What is strategy to begin with? First, I operationalize a basic strategic representation for computational modeling as a basic means of generating and utilizing combat power to achieve a desired end. This operationalizes strategic decision making, at a basic level, as the act of managing an economy for the generation of combat power as well as using combat power in battle. As the programmer and philosopher Manuel De Landa argues, the art of generating and deploying different combinations of weapon systems, organizations, and tactics creates emergent higher-level structures. The economic element of war, as historian William H. McNeill argues, lies in producing the force structures necessary to generate military power. As A.E. Stahl and William F. Owen observe, strategy is “done” as tactics, and as military historian Archer E. Jones has argued, tactics themselves may be represented as a series of basic weapon systems that recur in different combinations throughout military history. Jones also notes that different types of strategic tendencies may be related to combinations of these techno-tactical systems.

Second, I argue as per von Moltke that strategy is a system of expedients; it is a means of bounding adaptation. It is important to explain the context of von Moltke’s remark — which has been utilized in social science as well as military strategy — before continuing further. As von Moltke observed, only the beginning of a strategic interaction could be planned in detail as an operational plan. The rest was a repertoire of options that could be assembled on the fly into operational plans and were understood to be short-lived and disposable in nature. Strategy is the art of managing this “system of expedients” and may be regarded as a constant process of planning and replanning as the situation changes.

Finally, I argue as per Andrew Marshall, Albert Wohlstetter, J.C. Wylie, John Boyd, and a host of other strategic theorists both classical and modern that strategy is a means by which two opposed systems attempt to impose their will on each other; this process involves both a capacity for keeping yourself in a strong position over time and making decisions based on what you believe the opponent will do given the opponent’s perception of your expected behaviors. All of these are the basic elements that must be simulated, and I argue that in order to do so we may dimensionally reduce the variegated elements of a basic theater strategy scenario to a wargame or computer game-like representation in a manner sufficient to experimentally study strategy with complex agents.

This methodological work, moreover, should be specified at the level of interaction between individual agents. At its most basic, Clausewitz said that war is a duel. At their most basic, social scientific conceptions of strategy are simple zero-sum games. The divide between the two fields is simply their differing takes on the complexity of what it means to “play” the game. As Clausewitz again noted, war is simple but the simplest things in war are difficult. Before computational modelers create strategic studies work that attempts to simulate the strategies of nations, they ought to work on a granular level to specify the complex strategic interaction between relatively simple systems. The computational dimension would move social science approaches to strategy away from formal representations of strategy (game theory) and empirical ones (process-tracing or regressions) toward an experimental paradigm that attempts to build theory-based models and experiments with complex strategic agents.

As noted in the previous paragraph, we may dimensionally reduce the variegated elements of a basic theater strategy scenario to a wargame or computer game-like representation in a manner sufficient to experimentally study strategy with complex agents. This assumption is already made in the computer-generated forces and wargaming communities. If the dimensional reduction allows for repeated strategic interactions, we have created an arena of military competition. At the highest level, we see strategy as a competition over time. Success stems not just from making the best decisions during each discrete strategic game instance, but in finding areas of comparative advantage over time.

However, it also should be noted that this dimensional reduction ought not to completely remove decision complexity. Strategy is a multi-scale problem and is done in real time. Strategic behavior is “done” at differing functional and hierarchal levels; typical cognitive and social scientific representations underestimate the amount of procedural knowledge needed in adversarial situations where an agent must be capable of complex behavior and must tie that behavior to an expectation of what an adversary will do given that adversary’s expectation of the agent’s possible behaviors. Adversarial and competitive decisions must be made in an environment of temporal uncertainty (no my-move, your-move abstractions; planning and replanning) and strategic uncertainty (history may not be useful; a combinatorially enormous response surface).

The complexity of strategic behavior and the ways in which everything may go wrong mean that “strategy is not a plan”; it is both a control mechanism for constraining behavior and a design for adaptation that constrains and guides the way that agents adapt their behavior to the environment and their opponents. While strategy may notionally be about achieving a desired end, there is a big space between the initial conditions and the desired end. Hence, prior to the full conversion of actions into strategic effect, strategic effectiveness lies in how strategic entities continuously place themselves in positions of advantage as they move towards their abstract ultimate goal and thwart their opponents.

This challenge, though specialized for conflict simulations, is not necessarily exclusive to them in basic form. In both biological intelligence and a variety of other systems, hierarchy and modularity are the keys to producing complex behavior. The coupling of system elements is important in everything from neuronal dynamics models in computational neuroscience to theories of complex disasters in sociotechnical systems. Finally, the timing and sequencing of control and action in both individuals and organizations is also critical. Individual and collective strategic computation comprise two distinct “images” that share all of these variables as core elements of their behavioral organization. First, let us return to the piece “Action Selection for Intelligent Systems” (Brom and Bryson 2006) that I cited in the prior post:

The main problem for action selection is combinatorial complexity. Since all computation takes both time and space (in memory), agents cannot possibly consider every option available to them at every instant in time. Consequently, they must be biased, and constrain their search in some way. For AI, the question of action selection is: what is the best way to constrain this search? For biology and ethology, the question is: how do various types of animals constrain their search? Do all animals use the same approaches? Why do they use the ones they do?

One fundamental question about action selection is whether it is really a problem at all for an agent, or whether it is just a description of an emergent property of an intelligent agent’s behaviour. However, the history of intelligent systems, both artificial (Bryson, 2000) and biological (Prescott, 2007) indicate that building an intelligent system requires some mechanism for action selection. This mechanism may be highly distributed (as in the case of distributed organisms such as social insect colonies or slime moulds) or it may be one or more special-purpose modules [emphasis mine].

To borrow a term of art from Kenneth Waltz, computational research in strategy has two “images” — individual heuristics and perception, and distributed computation and control. Due to the nature of what computational modeling can offer as a theory-building device for strategic behavior, its initial focus ought to be formalizing and advancing knowledge about the nature of strategy as a control system for adversarial situations. These two distinctive research programs relate to several ideas of how to represent strategy.

As De Landa notes, one response to the generalized problems of friction on the battlefield and the complexity of building, organizing, and controlling combined arms systems over wider and wider distances was to centralize decisionmaking. This allowed for superficially more fine-grained control at the cost of greater difficulty in making effective decisions and of hamstringing local initiative. At the other extreme is the notion of a mobile army as a distributed system with very little centralization, which allows for more flexibility at the cost of unpredictability. Of course, as Bryson and Brom observed, no distributed action selection mechanism is fully decentralized. Even if all modules run in parallel, resolving conflicts between modules, allocating resources, and sequencing their behavior requires a kind of hierarchy.

However, centralized control merely moved the problems of distribution, modularity, and sequencing inside the mind of the notional representative agent that stood in for the decisionmaking system in overly centralized systems. The cognitive science theory of “threaded cognition” suggests a similarly hierarchal and distributed mechanism for human multi-tasking — certain human tasks may be run in parallel as long as they do not cause a resource conflict. Likewise, the much older theory of plans and the structure of behavior suggested that a pure stimulus-response model of behavior neglected the need for an intermediate layer that organizes responses. Behavior-based robots, for example, have arbitration mechanisms that inhibit behaviors based on priority.
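To make the arbitration idea concrete, here is a minimal sketch of a priority-based arbiter of the kind used in behavior-based robotics. It is only an illustration: the behavior names, trigger conditions, and priorities are all made up, and a real agent would have far richer state.

```python
# Minimal sketch of priority-based behavior arbitration (hypothetical behaviors).
# Each module proposes an action when its trigger condition holds; the arbiter
# lets the highest-priority active module drive, inhibiting everything below it.

from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Behavior:
    name: str
    priority: int                      # higher number = higher priority
    trigger: Callable[[dict], bool]    # does this module want control?
    act: Callable[[dict], str]         # the action it would emit

def arbitrate(behaviors, state) -> Optional[str]:
    """Return the action of the highest-priority triggered behavior."""
    active = [b for b in behaviors if b.trigger(state)]
    if not active:
        return None
    winner = max(active, key=lambda b: b.priority)
    return winner.act(state)

# Hypothetical modules for a toy wargame-like agent.
behaviors = [
    Behavior("retreat", 3, lambda s: s["own_strength"] < 0.3, lambda s: "fall back to base"),
    Behavior("attack",  2, lambda s: s["enemy_visible"],      lambda s: "engage enemy force"),
    Behavior("expand",  1, lambda s: s["resources"] > 100,    lambda s: "build new base"),
]

state = {"own_strength": 0.8, "enemy_visible": True, "resources": 150}
print(arbitrate(behaviors, state))   # attack and expand both trigger; attack wins on priority
```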

So this presents us with two interesting, if ultimately divergent, ways to look at how an entity of interest uses an action selection mechanism to generate and execute strategies.

On one hand, we have a focus — one which occupied much of 20th-century cognitive science, artificial intelligence, wargaming, and decision-theoretic modeling — on (1) the heuristics and perception of a representative agent or abstract central system, (2) adaptive feedback and control of a representative agent or abstract central system, and (3) simplified mechanisms of either rational behavior or the components that produce rational behavior in a demanding environment relevant to a representative agent or an abstract central system. Oftentimes, as with John Boyd’s OODA Loop, the dividing lines between such levels were blurred.

On the other, we have a notion of action selection for generating strategies as a process computed by a distributed system. This can be viewed as a simple way of realizing Colin Gray’s note that strategy is a process of “dialogue and negotiation.” This kind of work has been seen in multi-agent systems, collective cognition, insect biology and animal behavior, organizational theory, and other related fields of inquiry. While it may be tempting to argue that one merely reduces to the other, this is only true in a superficial sense. The way that behavior is structured in a complex agent vs. a complex system differs due to the way in which modularity and hierarchy are represented.

Some may favor one or the other; personally, I believe that both “images” are necessary. By working on adversarial games with complex agents and adversarial games with complex distributed systems, we can gain insights on respective pieces of the puzzle.

Procedural Rationality and RTS Games

In picking a canvas to use for computational modeling, realism initially may not be important. Clausewitz himself patched together his theory of war by relating his qualitative insights from his own experiences and the study of military history to abstract theories and metaphors from physics and probability. Hence, given that Clausewitz’s notion of war holds that conflict has simple rules but complex issues of uncertainty inherent in how actors make choices, games will likely be key tools of theory development. As previously noted as well, a wargame may function as a dimensional reduction of strategy. Given that both realistic and recreational wargames are the closest link we have between the “games” of game theorists and military-strategic theorists, building models at the level of detail of a simplified tabletop game or computer wargame is probably the best way to start real computational modeling of strategy.

Action selection and complex behaviors and heuristics in adversarial games with complex agents can shed light on the cognitive and social elements of strategy relevant simply to how behavior and psychology interface with the atomic elements of strategy; after all, procedural reasoning mechanisms in organizations and individuals do have some important similarities. These similarities have never been systematically explored in the context of strategic theory, and assuming a simplified zero-sum representation can draw them out. Furthermore, if we create a population of such agents, each engaging in discrete strategic games with the others, we can see what kind of collective strategies arise out of evolutionary pressure from the collective interaction of complex agents with differing strategic behaviors, while removing the environmental determinism often baked into the design of such population models.

Additionally, by assuming each strategic context is a discrete interaction between two complex agents, we can shed light on the more realistic (but difficult) case of how strategy is computed by a complex organization engaged in an adversarial game with another complex organization. If the organization has all of the resources and available command and control behaviors of the complex agents in the first scenario, strategy is a method of allocating access to resources and behavioral roles and synchronizing them in time. As with the simpler case, we may simulate how these organizations converge to behavioral organizations and configurations over time and within a population of other organizations with their own configurations.

This would combine a standard research strategy in the computational social sciences — simulating the bottom-up emergence of population trends in a decentralized network of agents — with simplified strategic formalisms (the complexities of which would be parsed out to behavioral modules in a single agent or a collection of agents collaborating together in an organization), with either complex agents or complex systems of agents as the micro units that produce macroscopic results.

Though I have mentioned in a previous entry that one “image” of strategic representation is the notion of action selection as the product of a distributed system, I have decided, at least for my own work, to simplify what is already a rather thorny intellectual problem by using the single-agent/system method and the adversarial game image. Here, I lay out the properties of strategic behavior (assuming a simplified military simulation) and the complexities and demands of strategy and the environment it takes place within.

As noted above, building models at the level of detail of a simplified tabletop game or computer wargame is probably the best way to start real computational modeling of strategy. There is a large historical literature that shows how conflict and strategy games — both recreational and professional — have driven key work in wargaming, computer-generated forces, cognitive systems and cognitive modeling, artificial intelligence, many branches of social science (game theory most obviously), and even many non-social disciplines in the so-called “natural” and “hard” sciences and engineering fields. Lacking an umbrella term for all of this, I will dub this giant class of literature “adversarial reasoning.”

The historian R.J. Leonard has shown how simplified recreational conflict games gave rise to modern social science. Interdisciplinary work in artificial intelligence, game theory, and cognitive science was done as part of Cold War research and wargaming (including AI planning and control for strategic tasks). The notion of using symbolic operations to understand and master defined scenarios pervades the history of conflict and the computational sciences. Games and microworlds have been utilized for a variety of work in the “cyborg” social and engineering sciences, and also as a part of the military’s focus on areas such as wargaming and command and control.

The patron saint of computational social science modeling, Herbert Simon, derived his famous concept of “bounded rationality” from economic and organizational behavior and reasoning:

The alternative approach employed in these papers is based on what I shall call the principle of bounded rationality: The capacity of the human mind for formulating and solving complex problems is very small compared with the size of the problems whose solution is required for objectively rational behavior in the real world — or even for a reasonable approximation to such objective rationality.

Simon developed the concept of “procedural rationality” through his research on simulating human decision-making in defense and combat situations. Simon moved away from the notion of bounded rationality as a deviation from rationality and began to focus on “procedural rationality” — which introduced the notions of computational decision-making simulations and satisficing as an optimizing process.

The fact that these two elements — computation and satisficing — appear in Simon’s work in 1955 is not casual. In 1952, he became a consultant to RAND Corporation, initially involved in simulations of an air-defense early warning station, and then, from 1955 on, connected with the Computer Science Department. RAND was the paradigmatic military think tank in the post-Second World War period. It was also the world’s largest computational structure for scientific ends at the time. Simon’s entrance in RAND marks an intellectual inflection of his. Among the aspects of this change that interests us here is his distancing away from economics toward the areas of psychology and computer science, a move that would only, and partly, be reverted in the 1970s — more specifically he placed himself in the nascent disciplines of cognitive psychology, cognitive science, artificial intelligence, operations research, and computer science, all of them tightly connected with the computer.

His research program became essentially aimed at discovering the symbolic processes that people use in thinking, and was based on the exploration of an analogy between the computer and the human mind. The main method used was the combination of the tape-recording of the problem solving activity of subjects in the laboratory — producing “thinking-aloud protocols” — and of the simulation of computer programs that tried to emulate the activity registered in the laboratory. This meant that programs were taken to be theories: the program capable of simulating the human behavior recorded in the laboratory is, in itself, an explanation to that behavior. The attempt at programming (theorizing) the solution processes of relatively complex problems in computers with very limited memory and processing capacity led to the satisficing hypothesis: maximization would be impracticable without drastic simplification of the model. In other words, if, on the one hand, the mind-computer analogy suggests a very concrete image of what are the agents’ cognitive limits, on the other hand, programming always demands specification: what information the agent possesses, what criteria and procedures he or she uses to make decisions. Without such specifications, the programming cannot even begin.

While satisficing and the notion of computational models of decision-making are lumped in with Simon’s original work on bounded rationality, they really stem from procedural rationality. Satisficing, in turn, became linked to efficient search and knowledge representation, with Simon comparing everything from economic behavior to chess from the standpoint of satisficing and search heuristics. Examine the following passage in his Nobel Prize lecture:

Information processing theories envisage problem solving as involving very selective search through problem spaces that are often immense. Selectivity, based on rules of thumb or “heuristics”, tends to guide the search into promising regions, so that solutions will generally be found after search of only a tiny part of the total space. Satisficing criteria terminate search when satisfactory problem solutions have been found. Thus, these theories of problem solving clearly fit within the framework of bounded rationality that I have been expounding here.

By now the empirical evidence for this general picture of the problem solving process is extensive. Most of the evidence pertains to relatively simple, puzzle-like situations of the sort that can be brought into the psychological laboratory for controlled study, but a great deal has been learned, also, about professional-level human tasks like making medical diagnoses, investing in portfolios of stocks and bonds, and playing chess. In tasks of these kinds, the general search mechanisms operate in a rich context of information stored in human long-term memory, but the general organization of the process is substantially the same as for the simpler, more specific tasks.
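As an illustration of the mechanism Simon describes (heuristic search that terminates at the first “good enough” option rather than exhausting the space), here is a minimal sketch. The aspiration level, the search budget, and the toy scoring function are my own stand-ins, not anything from Simon’s programs.

```python
# Sketch of satisficing search: scan candidates in a heuristically ordered
# stream and stop as soon as one clears the aspiration level, rather than
# evaluating the whole (possibly enormous) space.

def satisficing_search(candidates, evaluate, aspiration, budget):
    """Return the first candidate whose value meets the aspiration level,
    falling back to the best seen so far once the search budget runs out."""
    best, best_value = None, float("-inf")
    for i, option in enumerate(candidates):
        if i >= budget:
            break                      # bounded computation: stop searching
        value = evaluate(option)
        if value > best_value:
            best, best_value = option, value
        if value >= aspiration:
            return option              # good enough: terminate search early
    return best                        # best found within the budget

# Toy example: 1,000 possible "moves" scored by a crude stand-in heuristic.
moves = range(1000)
score = lambda m: (m * 37) % 100
print(satisficing_search(moves, score, aspiration=90, budget=50))
```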

In 1989, Simon, reflecting on his time both simulating human problem-solving and writing chess programs, argued that the conflict strategy game of chess could shed light on the destinies of nations:

Procedural rationality is concerned with procedures for finding good actions, taking into account not only the goal and objective situation, but also the knowledge and the computational capabilities and limits of the decision maker. The only non-trivial theory of chess is a theory of procedural rationality in choosing moves. The study of procedural or computational rationality is relatively new, having been cultivated extensively only since the advent of the computer (but with precedents, e.g., numerical analysis). It is central to such disciplines as artificial intelligence and operations research. Difficulty in chess, then, is computational difficulty. Playing a good game of chess consists in using the limited computational power (human or machine) that is available to do as well as possible. This might mean investing a great deal of computation in examining a few variations, or investing a little computation in each of a large number of variations. Neither strategy can come close to exhausting the whole game tree — to achieving substantive rationality.

…We have seen that the theory of games that emerges from this research is quite remote in both its concerns and its findings from von Neumann-Morgenstern theory. To arrive at actual strategies for the play of games as complex as chess, the game must be considered in extensive form, and its characteristic function is of no interest. The task is not to characterize optimality or substantive rationality, but to define strategies for finding good moves — procedural rationality… What is emerging, therefore, from research on games like chess, is a computational theory of games: a theory of what it is reasonable to do when it is impossible to determine what is best — a theory of bounded rationality. The lessons taught by this research may be of considerable value for understanding and dealing with situations in real life that are even more complex than the situations we encounter in chess — in dealing, say, with large organizations, with the economy, or with relations among nations.

If Simon were alive today, I strongly suspect that he would be using real-time strategy (RTS) games instead of chess for his own simulations of procedural rationality.

Hence, given that I seek to simulate strategy at a basic level (which entails a focus on conflict games, decision-making processes, and a simplified representation of core defense problems), I utilize the real-time strategy (RTS) game as my formalism of interest. RTS games are currently utilized in the manner that chess, kriegspiel, and other complex strategic games used to be — as a testbed for artificial intelligence/cognitive systems research and competition, a testbed for the study of expertise and complex decision making, an inspiration for wargaming and simulation, and a metaphor for complex problems in domains and policy concerns that at first glance seem to have little to do with the game — and they are played by both professionals and amateurs.

The dynamics of RTS games may be described as follows:

Generally, each match in an RTS game involves two players starting with a few units and/or structures in different locations on a two-dimensional terrain (map). Nearby resources can be gathered in order to produce additional units and structures and purchase upgrades, thus gaining access to more advanced in-game technology (units, structures, and upgrades). Additional resources and strategically important points are spread around the map, forcing players to spread out their units and buildings in order to attack or defend these positions. Visibility is usually limited to a small area around player-owned units, limiting information and forcing players to conduct reconnaissance in order to respond effectively to their opponents. In most RTS games, a match ends when one player (or team) destroys all buildings belonging to the opponent player (or team), although often a player will forfeit earlier when they see they cannot win.

RTS games have a variety of military units, used by the players to wage war, as well as units and structures to aid in resource collection, unit production, and upgrades. During a match, players must balance the development of their economy, infrastructure, and upgrades with the production of military units, so they have enough units to successfully attack and defend in the present and enough resources and upgrades to succeed later. They must also decide which units and structures to produce and which technologies to advance throughout the game in order to have access to the right composition of units at the right times. This long-term high-level planning and decision-making, often called “macromanagement”, is referred to in this paper as strategic decision-making. In addition to strategic decision-making, players must carefully control their units in order to maximise their effectiveness on the battlefield. Groups of units can be manoeuvred into advantageous positions on the map to surround or escape the enemy, and individual units can be controlled to attack a weak enemy unit or avoid an incoming attack. This short term control and decision-making with individual units, often called “micromanagement”, and medium-term planning with groups of units, often called “tactics”, is referred to collectively in this paper as tactical decision-making.

Additionally:

Each player must balance building up his or her economic production with developing fighters, defenses, and eventually more bases. A player must also try to discern the economic and military strategy of the opponent, whose base is initially hidden from view. As mineral and gas production ramps up, the gamer gains capital to spend on developing more advanced technology, a larger economy, or more fighters. StarCraft 2’s overarching strategic challenge is to decide how much time and money to devote to building up either an economy or an army.

But that’s just one level of play. When you attack, you do not simply dispatch fighters — you manually control them by using your mouse to click on them and set them in motion. A skilled player will monitor the health of individual fighters in the frontlines of a battle and pull them back to the rear as they lose strength, giving them time to recover while fresher troops bear the brunt of the attack. At any given time, you can have several dozen fighters with different abilities pottering around a map, numerous laborers busily mining minerals and gas, and various facilities producing new tools and resources that should be deployed immediately. With so many moving parts, even a top-level player can succumb to paralysis. Translating those goals into cognitive load, the brain’s executive functions manage most of the game’s demands. Several types of memory may be engaged to keep track of the weapons at one’s disposal and the locations of multiple objects on a map; attentional systems allow a player to plan future moves, switch focus to different activities around the map, and evaluate the enemy’s strategy. Motor skills are needed to rapidly click around the map to move and implement actions.

In short, the game is a relentless exercise in multitasking and constant decision-making. The winner, often, is the person who can make the most moves — an elite player can perform about 5 or 6 actions a second, which translates into a flurry of key presses and mouse maneuvers (see video, above).
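One computational reading of the macromanagement/micromanagement split described in the passages above is a two-layer controller: a slow strategic layer sets a posture, and a fast tactical layer turns that posture into per-tick orders. The sketch below is only a hypothetical skeleton with made-up state and orders; it is not an interface from BWAPI or any actual RTS engine.

```python
# Hypothetical two-layer controller: strategic decisions at a slow timescale
# ("macromanagement"), tactical decisions every tick ("micromanagement").

def strategic_layer(game_state):
    """Pick a high-level posture from coarse features of the game state."""
    if game_state["army_supply"] < game_state["enemy_army_estimate"]:
        return "build_army"
    if game_state["minerals"] > 400:
        return "expand_economy"
    return "hold"

def tactical_layer(posture, game_state):
    """Translate the current posture into per-tick, unit-level orders."""
    if posture == "build_army":
        return ["queue_combat_unit", "rally_units_at_choke"]
    if posture == "expand_economy":
        return ["queue_worker", "start_new_base"]
    return ["scout_with_idle_unit"]

# Main loop: re-plan strategy every 100 ticks, act tactically every tick.
game_state = {"army_supply": 10, "enemy_army_estimate": 18, "minerals": 500}
posture = "hold"
for tick in range(300):
    if tick % 100 == 0:
        posture = strategic_layer(game_state)
    orders = tactical_layer(posture, game_state)
    # (in a real model, the orders would be executed against the simulation here)
print(posture, orders)
```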

While the game is not a literally realistic representation of strategy (it is, after all, a simplified military simulation), it may help by producing a simplified representation of how agents deal with a more modern conception of “procedural reasoning.” Though computational modelers often swear by Simon’s insights, they have forgotten his pioneering interest in the complexities of human decisionmaking at a granular scale, the core role of defense applications in stimulating his insights, and above all else his usage of games as a theory development tool. All of these may be combined in a social, multi-agent, computational model. For example, the real-time strategy nuclear simulation known as DEFCON is a rather simple game when compared to the complexities of real-world nuclear strategy. Yet compressing that complexity into a semi-realistic representation (complete with naval, air, and missile forces) that we know people play in the real world does several things for us. Even highly fantastical representations of strategy have been utilized in a military context as sandboxes for data farming; run the simulation a bunch of times and generate large amounts of data that can be sifted through.

And, most importantly, this research method grounds the work in a defined formal game structure, allowing for a way to formalize intuition about complex strategic behaviors through code while also retaining something of a connection to social science representations of strategy through the shared notion of a “game.” Moreover, even very recreational approximations of strategy are not completely disconnected from the psychological dynamics described in the last section — notions of schema, affordance, and chunking have been used to explain the playing of commercial strategy games.

Is this method a panacea or even a partially sufficient solution? No. But strategic studies is a rather demanding discipline that is inherently suspicious of parsimony and abstraction. In order to deal with those expectations as well as social science’s roots in generalizable causal mechanisms and computational modeling’s elements of mathematics, algorithms, and simulation systems, wargames and other microworlds that bridge such gaps are probably the best place to bring the “computational” to the “strategic” in strategic studies. And by focusing on the issue of action selection, researchers can develop formalisms, theories, and approaches that may be utilized elsewhere. Every journey, after all, begins with small steps.

II: Complex Agents and Complex Games

Games and Microworlds as Research Tool

Modeling strategy involves complex systems on multiple levels. Complex systems generate unpredictable behavior through hierarchal complexity — a system has different modules with hierarchal relations but long-run dependence on each other’s actions. This may be represented both as a system within an individual and as a system within a population. As noted before, there are hierarchal, modular, and emergent aspects to strategy. But this is also a property of agent-based modeling. In particular, agent-based models of social systems defy mean-field dynamics — more agents and more heterogeneous agents render a system of interest unpredictable.

If we were to think about computational agents, we can start low and move higher and higher up. Agents have several basic problems at the cognitive and rational bands: regulating actions and regulating the action selection process. First, agents must be capable of simply answering the “what will I do next?” question. This is a matter of biasing search in a given direction. Second, agents must find ways of controlling their own behavior in a noisy environment; without this sense of self-control they cannot resolve resource tradeoffs between goals and behaviors.

The bridge between the prior bands and the social band lies in coping with novelty. Agents must cope with changes both in the environments they are situated within and in the behavior of the agents they interact with. As noted earlier, complex games have chaotic dynamics and regimes. Also as noted earlier, learning and adaptation in games is difficult due to noisy signals from the environment and other agents as well as the problem of anticipating the behaviors of multiple relevant decisionmakers.

In many activities of social life representable as complicated games, both action selection and learning are complex. Strategy, as Clausewitz noted, may be understood as a kind of game played with simple rules and complex probabilistic calculations. To simulate intelligent behavior in such domains, a controller biases search and control in a certain manner, utilizing tools such as hierarchy and modularity to structure intelligent behavior and adaptation and learning to improve performance. These types of controllers are for complex agents. A complex agent is an agent that defies the assumption of environmental determinism and faces limitations on both thinking (how many actions it may consider at every time step given limits on computation) and doing (how many actions it may execute at every time step given resource conflicts between actions).
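A minimal way to encode those two limits (a bound on how many candidate actions can be considered per time step, and resource conflicts that cap how many chosen actions can actually run) is sketched below. The budget, the resource names, and the toy actions are all assumptions made purely for illustration.

```python
import random

# Sketch of one decision step for a "complex agent" under two explicit bounds:
# THINK_BUDGET caps how many candidate actions are evaluated per tick, and
# resource conflicts cap how many of the chosen actions can execute together.

THINK_BUDGET = 5                               # candidates considered per tick
RESOURCES = {"attention": 2, "supply": 1}      # what one tick of "doing" allows

def step(candidate_actions, evaluate):
    # Thinking limit: only a bounded sample of candidates is ever scored.
    considered = random.sample(candidate_actions,
                               min(THINK_BUDGET, len(candidate_actions)))
    considered.sort(key=evaluate, reverse=True)

    # Doing limit: execute actions greedily while their resource demands fit.
    remaining = dict(RESOURCES)
    executed = []
    for action in considered:
        needs = action["needs"]
        if all(remaining.get(r, 0) >= amount for r, amount in needs.items()):
            for r, amount in needs.items():
                remaining[r] -= amount
            executed.append(action["name"])
    return executed

actions = [
    {"name": "attack", "needs": {"attention": 1, "supply": 1}},
    {"name": "scout",  "needs": {"attention": 1}},
    {"name": "expand", "needs": {"attention": 2}},
    {"name": "defend", "needs": {"attention": 1, "supply": 1}},
]
print(step(actions, evaluate=lambda a: sum(a["needs"].values())))  # toy scoring
```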

Complex agents’ computational rationality helps them load-balance demands on cognitive computations, and some form of multi-tasking figures into human cognition. Another way we can model the process of how agents deal with these challenges is by assuming that agents can be represented in terms of capabilities that may change over time and affordances that define relations between the agent and environmental objects. Complex agents face environments that pose multi-task control problems. Agent behavior is not solely goal-driven; agents move through complex state and action spaces which demand both attending to goals and dealing with immediate state-based needs. The simulation theory of mind is popular as a folk theory of mind (“I am going to do X, I am going to do Y”) but may not be accurate in describing the behavior of both self and other. Dual-process theories hold that functioning in the world may be aided by combinations of parallel processes. When behaviors and situations are interdependent, goal arbitration is necessary to ensure that agents do not oscillate from goal to goal uncontrollably.
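The goal-oscillation problem mentioned above is commonly handled by building some persistence, or hysteresis, into goal arbitration: the currently active goal receives a small bonus, so the agent does not flip goals every time their underlying values cross. A toy sketch, with made-up goal values:

```python
# Sketch of goal arbitration with hysteresis: the active goal keeps a small
# "stickiness" bonus so the agent does not oscillate when two goals have
# nearly equal value.

STICKINESS = 0.15

def arbitrate(goal_values, active_goal):
    adjusted = dict(goal_values)
    if active_goal in adjusted:
        adjusted[active_goal] += STICKINESS    # persistence bonus for the incumbent
    return max(adjusted, key=adjusted.get)

active = "gather"
# Two goals whose raw values keep crossing each other slightly.
trace = [
    {"gather": 0.50, "fight": 0.45},
    {"gather": 0.48, "fight": 0.52},   # fight edges ahead, but not decisively
    {"gather": 0.45, "fight": 0.70},   # now the gap is large enough to switch
]
for values in trace:
    active = arbitrate(values, active)
    print(active)                      # gather, gather, fight
# Without the bonus the agent would have flipped to "fight" on the second step
# and possibly back again later.
```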

Hierarchal organization of behavior is a key element in intelligence; hierarchal reinforcement learning has been observed as an abstraction mechanism in humans and animals, and high-level abstractions have been observed in the brain as a method of dealing with hierarchal demands. Modularity is also key; we see modularity in theories of evolutionary psychology as well as biological learning and action selection. Key issues involved in both, however, are the coupling between levels as well as the behavioral timing and organization of actions at various levels. How does this all relate to games?

Games with many possible moves condition many possible payoffs based on those moves; complex agents such as humans are cognitive maximizers that learn opponent strategies over time; agents alter their behavior in response to changes in the environment or other agents; and while reward is a powerful mechanism for learning, agents receive multiple signals from the environment. Complex games necessitate both plan selection and plan monitoring; agents have to select near-term goals, quickly plan them, and execute the plans in the game environment, as well as utilize metacognitive reasoning to evaluate their own plan selection and reasoning processes in time.

When games feature simultaneous actions, concurrent actions, and durative actions, some degree of abstraction and hierarchy is needed to deal with high branching factors. Games with multi-tasking behaviors also pose challenges in terms of memory and focus — when devoted to a single task the brain excels, but when it handles multiple goals at once focus blurs and errors are unavoidable.

Toward Computational Modeling of Strategy

The atomic elements of strategy are the generation and employment of combat power to achieve a strategic goal, and we can dimensionally reduce this to a tabletop wargame or computer game-like theater strategy representation. The difficulties of strategy stem from the multi-scale property of strategic behavior and the prevalence of temporal and strategic uncertainty when dealing with an opponent. While the initial aspects of strategy may be represented as an operational plan, how strategy plays out over time in each strategic interaction/game instance cannot be fully planned in advance. Moreover, if the goal is highly abstract, a near-term driver of strategy will be putting oneself in a good position to achieve the goal over time. Finally, if a strategic competition ensues between one or more strategic entities, each entity needs to develop a way of competing over time by investing in certain areas of comparative advantage and not others.

Strategy can be atomically represented, for the sake of computational social models, as the problem of a hierarchal controller making tradeoffs given a complex goal (defeat the enemy) that depends on its own capabilities, the actions afforded by the environment, and the expected behavior of the opponent. These tradeoffs may be observed on multiple levels. Resources used to create one type of combat force are resources not spent on another type of combat force, and if we have two bases we cannot defend both equally. Pursuing one type of plan means that another is not pursued. However, this conception of resource tradeoffs may be broadened. Resource tradeoffs exist in how many goals and behaviors an agent can pursue simultaneously and how many actions the agent can consider at once. At a low level, agent modules may conflict for processing time and resource focus.
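Written out crudely, the first of these tradeoffs is just an allocation problem: one budget divided among an economy and fronts that cannot all be fully defended. The payoff function, the budget, and the enemy pressure numbers below are arbitrary assumptions meant only to show the structure.

```python
# Toy illustration of the tradeoff structure: one budget, several competing
# sinks (economy, front A, front B). Spending on one is spending denied to
# the others; a "strategy" here is just a split of the budget.

def outcome(split, enemy_pressure):
    """Crude payoff: economy compounds, but under-defended fronts are penalized."""
    economy, front_a, front_b = split
    growth = 1.2 * economy
    losses = (max(0.0, enemy_pressure["a"] - front_a) +
              max(0.0, enemy_pressure["b"] - front_b))
    return growth - 2.0 * losses

BUDGET = 10
pressure = {"a": 3, "b": 5}

# Enumerate every coarse split of the budget and pick the best one.
splits = ((e, a, BUDGET - e - a)
          for e in range(BUDGET + 1)
          for a in range(BUDGET + 1 - e))
best = max(splits, key=lambda split: outcome(split, pressure))
print(best, outcome(best, pressure))   # -> (2, 3, 5): defend both fronts, invest the rest
```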

On one level, strategy is a control mechanism. It structures how an agent ought to control its own reasoning processes and behavior given the expected behavior of an opponent, turning a general notion such as “concentrate forces in time rather than space to exploit enemy numerical weaknesses” into a task-ordering to be executed tactically on multiple fronts. But strategy may also be viewed as simply a way of controlling adaptation. Strategy, as Helmuth von Moltke argued, is a “system of expedients.” Or, as Ian Malcolm said in Jurassic Park, “life finds a way.” Shit happens. A brilliant plan gets FUBAR’d the moment it comes into contact with the enemy. Without the ability to adapt, learn, and change, strategy is not useful.

Thus, strategy has a dual face. Its first face is as a complex controller that biases the search for good actions. As military theorists observe, it is a process of gaining some degree of control over the opponent by using engagements for the purpose of war; as social scientists observe, it is also a process of making choices given the expected behavior of another agent; as cognitive scientists observe, it is a way of being (boundedly) computationally rational and making complex behavior arbitrations and decisions based on the environment’s demands and a crude simulation of someone else’s crude simulation of you. However, its second face is as a way of controlling and biasing adaptation; military theorists note that adaptation and strategic learning are critical in conflict, business and organizational theorists have noted alternatively that strategy is planned emergence or simply a hunch about how to achieve advantage, and competitive strategies denote how a strategic competition plays out over time and what kinds of comparative advantage can be wrought from the “investments” agents make in iteratively struggling for dominance with each other.

As I have noted, we can decompose strategy into both a controller and adaptation mechanisms. The controller selects actions and constrains what actions are considered. An agent’s controller amounts to how its behavior is regulated, planned and executed. This defines a linkage between cognitive and rational bands of computational modeling — agents have control mechanisms for making decisions and higher-level mechanisms for self-regulation and load balancing that act as a meta-level mechanism for (bounded) rationality. However, both of these are structured so that they are adaptive; agent rationality is defined around “good enough” behavior given changes in the environment and other agents’ behaviors.

During a strategic game instance and across instances, an agent will attempt to adapt to what it believes the opponent(s) will do. If there is one opponent, the adaptation process is a question of “what is the best way of adapting given the opponent’s expected behavior, both in each interaction episode and across my history of interaction with the opponent?” If there is a population of opponents, the question becomes, “what is the best way to adapt that will help me do well enough to succeed in the population overall?”

However, no agent is a blank slate when it comes to how to structure the adaptation process — it always begins with some kind of pre-existing bias or frame that constrains its adaptation. This bias may be seen on multiple levels. First, when the agent begins each discrete strategic interaction, it may be better disposed toward a certain type of adaptation mechanism; second, when it adapts between strategic interaction instances, it may be disposed toward a certain adaptation and learning mechanism. I argue that similar challenges occur across different temporal scales of strategy: adapting action selection mechanisms both during discrete episodes and over a sustained period of time.

One kind of problem is “how should I adapt right now during this interaction episode?” The other sort of problem lies in adaptation over multiple interaction episodes, with the agent asking “how should I adapt over time as I interact with a large population?” As we see with recent developments in neuroscience, hierarchal domains may necessitate layered learning, and agents face a challenge in altering their decisions based on a constantly shifting action-outcome relationship. As we see with the Experience Weighted Attraction algorithm, variable-ratio reinforcement, and the El Farol Bar Problem, how time and experience are utilized in adaptation is a topic of huge importance as well.
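Schematically, the two timescales separate into an inner loop (adaptation within a single game instance) and an outer loop (adaptation of the agent’s starting bias between instances). The skeleton below is an assumed structure rather than a committed algorithm: the inner loop is collapsed into a single noisy match outcome, and a simple multiplicative weight update stands in for whatever between-episode learning rule ends up being used.

```python
import random

# Two-timescale adaptation skeleton. The outer loop adjusts the agent's
# starting bias (its preferred opening policy) across episodes; the inner,
# within-episode loop is abstracted into a single noisy outcome.

POLICIES = ["rush", "turtle", "expand"]
COUNTERS = {"rush": "expand", "turtle": "rush", "expand": "turtle"}  # toy counter cycle

def play_episode(policy, opponent_policy):
    """Stand-in for one full game instance with its own inner adaptation loop."""
    win_prob = 0.7 if COUNTERS[opponent_policy] == policy else 0.4
    return random.random() < win_prob

def run(episodes=300):
    weights = {p: 1.0 for p in POLICIES}        # the agent's pre-existing bias
    for _ in range(episodes):
        opponent_policy = random.choice(POLICIES)        # unknown in advance
        if random.random() < 0.1:                        # occasional exploration
            policy = random.choice(POLICIES)
        else:
            policy = max(weights, key=weights.get)
        won = play_episode(policy, opponent_policy)
        weights[policy] *= 1.05 if won else 0.95         # between-episode update
    return weights

print(run())
```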

Modeling Problems

Computational social modeling of strategy is a way of modeling how complex agents in adversarial situations adapt to changes in the behavior of other agents and changes in the environment. When it comes to changes in the behavior of other agents, this change can be viewed at several levels:

  1. Changes in the behavior of other agents during a single strategic game instance/episode.
  2. Changes in the behavior of other agents between single strategic game instances/episodes.

When it comes to changes in the environment, this change may also be viewed at several levels:

  1. Changes in the physical environment; differing physical environments will have differing implications for the success and failure of agent behavior due to the impact of terrain on tactics and strategy. This is the case even if all physical environments can be assumed to share the same basic affordances and the rules of interaction do not change.
  2. Changes in the social environment (the agent population); if we assume a population of agents under normal agent-based modeling assumptions, collective biases towards ecological fitness may emerge from decentralized interaction. Hence the environment is constantly changing due to the dynamical and stochastic nature of the agent system.

As such, I define several basic scenarios:

  1. Repeated adversarial games where there is one opponent and one environment; agents must nonetheless use complex signals to adapt both during each strategic interaction episode and between episodes. This may be regarded as a much more complex version of a typical 2-agent zero-sum game with detailed agents and a detailed and rich environment. Agents will have to cope with the fact that opponent behavior will change both during games and in between them. However, the physical environment will not change. This may be regarded as the simplest case.
  2. Repeated adversarial games where there is one opponent but multiple physical environments; agents must nonetheless use complex signals to adapt both during each strategic interaction episode and between episodes. Because the physical environment in which the 2-agent zero-sum game with detailed agents and a detailed and rich environment takes place matters to the success or failure of certain strategies (differential geographic features), agents have to cope with the fact that both the environment and the opponent’s behavior will change between instances, on top of the fact that the opponent’s behavior will change during game instances.
  3. Population adversarial games where there is one physical environment but there are multiple opponents; agents must nonetheless use complex signals to adapt both during each strategic interaction episode and between episodes. This may be regarded as a much more complex version of a typical n-agent zero-sum game with detailed agents and a detailed and rich environment; agent behaviors change during games and in between games, and there may be a different opponent per game instance even if the environment stays the same.
  4. Population adversarial games where the physical environment changes and there are multiple opponents; agents must nonetheless use complex signals to adapt both during each strategic interaction episode and between episodes. This may be regarded as a much more complex version of a typical n-agent zero-sum game with detailed agents and a detailed and rich environment; each agent has to cope with two levels of selection. It must deal with within-game challenges and between-game challenges with an n-size group of heterogeneous agents. Because the physical environment in which the n-agent zero-sum game with detailed agents and a detailed and rich environment takes place matters to the success or failure of certain strategies (differential geographic features), the fact that the environment is constantly shifting complicates adaptation.

Much of this will be done with the Brood War API simulation system, which allows the injection of agents directly into a running game of StarCraft: Brood War.
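Across all four scenarios the experimental loop shares the same skeleton; the scenarios differ only in whether the opponent pool and the map pool contain one member or many. The sketch below is a hedged simplification: it treats the population scenarios as a single focal agent facing a pool of opponents, and run_match, the opponent names, and the map names are placeholders for whatever the actual harness (BWAPI or otherwise) provides.

```python
import random

# Skeleton of the four scenario types: one repeated-game loop, with the
# opponent pool and the map pool each having either one member or many.

def run_match(agent, opponent, game_map):
    """Placeholder for the real simulation harness: play one game instance
    and return the winner."""
    return random.choice([agent, opponent])

def scenario(agent, opponents, maps, episodes=100):
    record = {"wins": 0, "losses": 0}
    for _ in range(episodes):
        opponent = random.choice(opponents)   # singleton pool -> scenarios 1 and 2
        game_map = random.choice(maps)        # singleton pool -> scenarios 1 and 3
        winner = run_match(agent, opponent, game_map)
        record["wins" if winner == agent else "losses"] += 1
        # between-episode adaptation of the focal agent would happen here
    return record

focal = "focal_agent"
print(scenario(focal, opponents=["opp_1"], maps=["map_1"]))                     # scenario 1
print(scenario(focal, opponents=["opp_1"], maps=["map_1", "map_2"]))            # scenario 2
print(scenario(focal, opponents=["opp_1", "opp_2", "opp_3"], maps=["map_1"]))   # scenario 3
print(scenario(focal, opponents=["opp_1", "opp_2"], maps=["map_1", "map_2"]))   # scenario 4
```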
