Artificial Intelligence: Types of Environments

Aditya Kumar · Published in The Startup · 12 min read · Feb 20, 2020

Let’s find out how A.I. problem spaces are defined.

Photo by Emiliano Bar on Unsplash

The narrative of the environment

Imagine for a moment that you drive for Uber in New York City.

It’s 4:00 PM and it’s the day before Thanksgiving. You’re parked by Wall Street, waiting for your next passenger to arrive. You watch the roads as they get congested — pedestrians, cars, buses, and all — as everyone rushes out to get wherever they need to be, as quickly as possible.

You yourself want to beat the traffic before it’s too late, but there’s time for this one last ride.

Soon enough, someone knocks on your car door and you verify she looks close enough to her profile photo on the app. It says she’s headed to the airport. You let her in.

“Hi!” you turn around with a smile. “Domestic terminal at Newark airport, right?”

She acknowledges the greeting, friendly enough, but she seems a little anxious.

For some reason, you have a bad feeling about this. “What time is your flight?” you ask cautiously.

“It’s in an hour…”

Every goal has constraints

We’ve all been in a situation where a deadline was quickly approaching, and we had to choose between the bad and the less bad. Or in other, more technical, words, we had to maximize our utility within given constraints.

In the scenario above, the primary goal is to get to the airport and make a worthwhile profit.

There are a few constraints, however. Let’s consider time, the most obvious one. Simply put, because the passenger’s flight is in an hour, it would be best for us to arrive before then.

How might we solve this problem? Perhaps we could drive fast to save time, using the passing lane whenever possible. But this could be dangerous and we plan on staying within the speed limits. Perhaps we could find a shorter route using a GPS. But even with Google Maps and Waze, a shorter path is not always the quickest one; hence we have degrees of uncertainty.

Photo by Victor Xok on Unsplash

So how do we balance maximizing profit (which depends, say, on the length and duration of the journey) with optimizing customer happiness (so we can hang on to a five-star rating)?

Defining the problem space

It may seem like we are over-complicating the problem, but we as humans face decisions like this every day. Often the best starting point is to define the problem space and break it down into manageable sub-problems.

Similarly, when designing an artificial intelligence, we must take into account the parameters that can affect its journey from initial state to goal state. We call this specification the task environment.

The task environment, in turn, has certain specifications and properties. Stuart Russell and Peter Norvig, in their seminal book on artificial intelligence, describe it with the PEAS (Performance, Environment, Actuators, Sensors) framework.

Performance evaluates how well we have achieved the given goal. In the story above, we want to complete the trip, arrive in time, drive lawfully, remain safe, make a profit, and earn a good rating. All these elements measure how well we did our job driving from Wall Street to Newark.

Typing the environment is the focus of the rest of this article, but in essence, it categorizes the nature of external phenomena that may impact our progress toward the goal state. For example, while driving to the airport, we need to be concerned with other vehicles on the road, pedestrians at crosswalks, different roads and traffic signals, etc. We’re not alone in the world: what others do can affect us, and what we do can affect them.

Actuators are the means by which we act upon the environment. For example, a car has a horn to alert others, blinkers to inform of a turn, a steering wheel to actually turn, and pedals to control speed.

Sensors are the means by which we take in environmental cues. This might be a proximity sensor that sounds when we are too close to another vehicle, a lane-departure sensor that keeps us from drifting, or the dashboard display that lets us view general vehicle data (like oil temperature, upcoming maintenance, what radio station we’re on, etc.).
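
To make this concrete, here is a minimal sketch of a PEAS description as a data structure. The field values are illustrative, drawn from the story above, not from Russell and Norvig’s text:

```python
from dataclasses import dataclass

@dataclass
class PEAS:
    """A PEAS description of a task environment."""
    performance: list[str]  # how success is measured
    environment: list[str]  # what the agent operates within
    actuators: list[str]    # how the agent acts on the environment
    sensors: list[str]      # how the agent perceives the environment

taxi = PEAS(
    performance=["arrive on time", "drive lawfully", "stay safe",
                 "make a profit", "earn a good rating"],
    environment=["roads", "other vehicles", "pedestrians", "traffic signals"],
    actuators=["steering wheel", "pedals", "blinkers", "horn"],
    sensors=["proximity sensor", "lane-departure sensor", "GPS", "speedometer"],
)
```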

A self-driving agent. Image created by author

So why is environment the largest category? Well, when it comes to the external state of the world, there’s much that can happen, and much we need to plan for.

Let’s look at the six scales upon which to view the nature of the environment:

Fully observable vs. partially observable

How observable an environment is relates to how much relevant information we can draw from it at a given time. For example, while driving in the city, we can see the cars in front of us, behind us, and on either side (minus blind spots, of course). We might also be able to gauge how fast others are driving and whether they intend to change lanes or not, using their relative speeds and blinker indications, respectively. However, we likely can’t see what’s happening two miles ahead and a few blocks over to the right.

But we probably don’t need to know this information, as it doesn’t pertain to our immediate environment. However, if we are indeed driving two miles ahead and a few blocks over to the right, we might be interested in updates about traffic jams, road closures, etc., coming our way.

If all the information we need is available at any given moment, then our environment is fully observable. This kind of environment is convenient for an artificial intelligence, as it does not need to keep track of any extraneous variables; it simply takes in external cues for reflexive processing. On the other hand, having access to some, but not all, relevant information at any given time creates a partially observable environment. In this case, an artificial intelligence may need to keep track of history to inform future decisions.
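
As a rough sketch (the rule table and belief update below are placeholders, not a real driving policy), a simple reflex agent suffices when every percept is complete, while partial observability forces the agent to carry an internal state:

```python
RULES = {"car_ahead_close": "brake", "lane_clear": "cruise"}

def reflex_agent(percept: str) -> str:
    """Fully observable: the current percept alone picks the action."""
    return RULES.get(percept, "cruise")

class ModelBasedAgent:
    """Partially observable: maintain a belief state built from history."""

    def __init__(self) -> None:
        self.belief: dict[str, str] = {}  # what we believe but cannot currently see

    def act(self, percept: dict[str, str]) -> str:
        self.belief.update(percept)  # fold the new percept into the model
        # Decide from the belief state, e.g. a traffic jam reported miles ahead.
        if self.belief.get("traffic_ahead") == "jam":
            return "reroute"
        return reflex_agent(self.belief.get("nearby", "lane_clear"))
```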

Single agent vs. multi-agent

An agent is any autonomous entity (a human, a computer program, a robot, an animal, a self-driving car, etc.) that can perceive its environment (via sensors) and produce a response (via actuators). This response does not need to be correct or rational; that’s where performance metrics come in. But these responses are ultimately aimed at achieving an objective.

A single agent can be thought of as isolated. For example, if you are playing solitaire, there isn’t any second player to compete against, whereas if you sit down at a poker table, others are now involved in the outcome of the game.

Poker in particular is a fully competitive multi-agent environment. Each person at the table seeks to maximize his or her own utility, or “win”, at the expense of others.

What about driving? Are the roads a competitive environment as well?

To a certain degree, yes, but not fully so; there’s a contract (your driver’s license) that says avoiding collisions, following traffic rules, using blinkers, and driving within the speed limits helps keep everyone on the road safe, including yourself. Breaking these rules is competitive behavior, and may increase temporary utility (e.g. cutting in front of an exit pileup to save some time), but at the risk of getting a ticket or some other form of punishment. The roads are therefore partially competitive yet also partially cooperative.

Now let’s think about driving in the general sense: can a self-driving car’s environment be treated as single-agent instead of multi-agent?

Yes, it’s possible. If, on a perfect road, we assume the contract that people drive safely, then all other vehicles can simply be viewed as objects that we must avoid. This is like saying our car is the only one on the road, and that we need to navigate through moving obstacles to reach our destination.

In a fully cooperative multi-agent environment, this is a fair assumption to make, and we can treat the environment as single-agent instead. But introduce competitiveness and the assumption fails to hold.
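
To illustrate that reduction, here is a minimal sketch (the names and the constant-speed “contract” are assumptions for illustration) in which other vehicles are folded into the state as moving obstacles rather than modeled as agents:

```python
from dataclasses import dataclass

@dataclass
class Obstacle:
    position: float  # meters along the road
    speed: float     # meters per second, assumed constant (the "contract")

def predicted_position(obstacle: Obstacle, dt: float) -> float:
    """Where the obstacle will be after dt seconds, if it keeps its word."""
    return obstacle.position + obstacle.speed * dt
```

The moment another vehicle starts optimizing against us, this single-agent reduction breaks down and a true multi-agent model is needed.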

Deterministic vs. stochastic

Something that has been determined is guaranteed to happen. The laws of physics exemplify this: all other things being equal, if you drop an apple, it will fall toward the Earth’s center of mass. This is simply gravity at work.

In the case of an AI chess agent, we might imagine it in the initial state with no moves made yet. If our agent has the first move and decides to push a pawn forward, it is guaranteed that in the next state, that pawn is not in its starting position.

On the other hand, if the agent decided to move the pawn forward, and sometimes it ended up one space ahead, sometimes two spaces ahead, and sometimes exactly where it started, then we have a degree of uncertainty in what will happen. This randomness corresponds to a stochastic, or non-deterministic, environment.
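
A hypothetical transition function makes the distinction concrete (the stochastic pawn is invented for illustration; real chess is deterministic):

```python
import random

def deterministic_step(square: int) -> int:
    """The pawn always advances exactly one square."""
    return square + 1

def stochastic_step(square: int) -> int:
    """The same action, but with uncertain outcomes."""
    return square + random.choice([0, 1, 2])  # stays put, or moves one or two

# Under determinism, repeating an action from the same state always yields the
# same next state; under stochasticity, it yields a distribution over states.
```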

Episodic vs. sequential

Classifying an environment as episodic or sequential is related to action histories and long-term utilities.

For example, with a chess agent, each new action depends upon what happened previously. Or, in other words, different actions can have different consequences. Using your queen to take your opponent’s knight may bring short-term utility, but it may also put your queen at risk in the next move. This is a sequential environment.

An episodic environment, on the other hand, is one where each state is independent of one another. If a policeman with a radar gun is scanning a single-lane road for fast drivers, the speed of the previous car, all other things equal, has no bearing on the speed of the next one. It is as if each car that passes in front of the policeman is a separate, atomic state.

The difference between episodic and sequential environments comes down to conditional probability: in a sequential environment, the likelihood of an event depends on what came before it; in an episodic one, it does not.
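
Here is a quick sketch of the radar-gun example (the speed distribution is made up for illustration): each episode is an independent draw, so no history is needed:

```python
import random

def radar_episode() -> float:
    """Each car's speed is an independent draw: an episodic environment."""
    return random.gauss(mu=62, sigma=5)  # mph; the previous car doesn't matter

# The probability that this car is speeding does not change given the last
# car's speed. In chess, by contrast, the value of a move depends on the
# whole history of the game so far -- a sequential environment.
speeds = [radar_episode() for _ in range(5)]
```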

Static vs. dynamic

A static environment is one that does not change as an agent makes a decision. For example, in chess, your opponent cannot make a move while it is still your turn. It is as if the passage of time is irrelevant: a move made in five seconds is not necessarily better than one made in five minutes (assuming we are playing chess without a clock, of course).

A dynamic environment, on the other hand, changes while the agent deliberates. While driving, all other cars around us are moving as well; some are changing lanes, some accelerating, some decelerating, etc. Each decision that we make on the road can be affected by the sudden movement of other vehicles.

Note the similarity of a dynamic environment to a multi-agent environment. The key difference is that in a multi-agent environment, other agents act on their own utilities (sometimes against ours), whereas in a dynamic environment, the world itself changes while we deliberate. But there is certainly some overlap between the scales.
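
As a loose sketch of the difference (the toy world below is an assumption, not a driving simulator), a dynamic environment keeps evolving while the agent is still thinking:

```python
import time

def world_step(world: dict) -> None:
    """The environment advances whether or not we have acted yet."""
    world["other_car_position"] += world["other_car_speed"] * 0.1

def deliberate(world: dict, think_seconds: float) -> None:
    deadline = time.monotonic() + think_seconds
    while time.monotonic() < deadline:  # while we deliberate...
        world_step(world)               # ...the world moves on without us
    # In a static environment (chess without a clock), nothing would change
    # here: the board simply waits for our move.
```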

Discrete vs. continuous

Something discrete can only be separated into distinct, whole units. For example, if you flip a coin, you either get heads or tails; there’s nothing in between (you can’t land a fair coin at a 45-degree angle). On the other hand, something continuous can be measured to arbitrary precision: the temperature can be 100 degrees, or more accurately, 100.01 degrees, or even more accurately, 100.012 degrees. We can divide something continuous into infinitesimal portions.

A discrete environment, as a result, is one with a finite number of states. In chess, there are only so many ways we can arrange the board, and as a result, only so many actions we can take to reach those states.

On the other hand, in a continuous environment, our variables can take on any value within their specified (or infinite) ranges, smoothly, down to infinitesimal divisions. While driving, we cannot instantaneously accelerate from zero to sixty; it takes time, and the smooth change in speed reflects the continuous nature of a driving environment. The continuous environment also lends itself to physics-based modeling, which is how self-driving cars predict collisions with other vehicles on the road.
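
As a minimal instance of that physics-based modeling (a constant-velocity assumption, far simpler than what a real self-driving stack uses), we can extrapolate two vehicles’ relative motion and check their closest approach:

```python
import math

def time_of_closest_approach(p_rel, v_rel):
    """Time (s) at which two constant-velocity vehicles are nearest.

    p_rel, v_rel: relative position (m) and velocity (m/s) as (x, y) tuples.
    Minimizes |p_rel + t * v_rel| over t >= 0.
    """
    px, py = p_rel
    vx, vy = v_rel
    v_sq = vx * vx + vy * vy
    if v_sq == 0.0:                  # same velocity: the gap never changes
        return 0.0
    t = -(px * vx + py * vy) / v_sq  # unconstrained minimizer
    return max(t, 0.0)               # closest approach cannot be in the past

def min_distance(p_rel, v_rel):
    t = time_of_closest_approach(p_rel, v_rel)
    return math.hypot(p_rel[0] + t * v_rel[0], p_rel[1] + t * v_rel[1])

# If min_distance falls below a safety margin, flag a potential collision.
```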

These six scales offer a qualitative way of classifying an environment, and although there is overlap between them, they provide a good foundation for designing intelligent agents, the next major consideration of the AI problem space.

When doing so, it might be important to look at a seventh scale:

Known vs. unknown

The most “mysterious” of all the types, known versus unknown describes how well the designer of the intelligent agent knows the environment.

Without jumping too far into psychology, cognitive science, and philosophy, let’s consider the human infant. It is born into this world with certain reflexes, but it doesn’t know how to drive a car, how to play chess, what laws are, what human society is! It learns these rules and this information through experience and by interacting with others. This is the essence of an unknown environment.

Similarly, with a self-driving car, we as the programmers, the engineers, the designers, cannot test it on every single road that exists in the world. The artificial intelligence we endow the car with must be self-sufficient enough to use some basic, ingrained knowledge as a foundation for future insights. These insights then lend themselves to new rules and new understandings. In particular, we may test our self-driving car in New York City, but if we have developed it for American roads, it should be able to drive anywhere in the country without a problem.

Deep learning is often the mechanism we use to understand unknown environments, and is the crux of ongoing research in artificial intelligence.

But using this introductory understanding of environments, you should now be able to classify general problems into a space that tells you which elements to consider when searching for a solution. The next stage of problem-solving is designing the right agent. Happy learning!

Revisiting the narrative

Driving in a whirlwind: a partially observable, multi-agent, stochastic, sequential, dynamic, and continuous environment. Photo by Alessio Lin on Unsplash

Damn, you think. Now I’m as stressed out as she is. Why can’t people plan their trips accordingly?

But you can’t say that to her. “Okay, an hour sounds good,” you lie flatly.

Quickly, you switch to the Google Maps app to chart the best route. The path is mostly red and orange, never a good sign, but it will have to do. You pull out into the street.

The traffic moves relatively smoothly, but after a few moments, your passenger clears her throat and asks, “Pretty slow day, huh? Do you think the traffic near the airport is just as bad?”

“Well, the map looks a little orange there, so I guess so…but it’s too far away to know for sure. You see, we’re in a partially observable environment, and though the GPS gives us a big picture, in the immediate moment our decisions are based primarily on what is happening around us. To clarify, everyone here is trying to get somewhere quickly. Some to their homes, some to airports like us, some elsewhere on the map. But everyone on this particular road has access to only the same paths in the vicinity. Therefore, minimizing time spent introduces a partial sense of competition into this multi-agent, partially observable environment.”

“Oh, okay…”

Soon, you get on the highway, and traffic begins to clear up. Your passenger should arrive on time. You pull into the left lane, and use the car’s continuous-state accelerator to hasten the journey to the goal.

The passenger notices the speedometer tick up, and mentions, “I don’t need to get to the airport that fast.”

Yes you do, you think, …but she has a point. You slow down and return to the right lane. In any case, driving the car too fast introduces the possibility that the fuel pump can’t keep up. On an older car like this, you don’t want to make the gas pedal a stochastic state variable. And besides, it wouldn’t do to get a ticket; in this environment, a common sequence that follows speeding is getting pulled over…

You reach Newark in time, surprisingly, and pull into the departure zone. As you decide where to drop off the passenger, another car cuts around the front to claim the first empty space.

“Wow, someone’s in a rush!” the passenger exclaims. “But you can drop me off right here; thank you so much for the ride.”

“You’re very welcome.” You park the car and unlock the doors.

She steps out with her luggage, but turns around to say farewell. “I guess you can’t deliberate too long for which parking spot you want, right? The environment is pretty dynamic as well!”


Aditya Kumar is a medical student at the University of Colorado with a background in computer science. He researches AI mechanisms in economics and medicine.