How to create a custom OpenAI Gym environment (with code)

Creating a game environment in OpenAI-gym from scratch

Mehul Gupta
Data Science in your pocket
5 min read · Jul 11, 2023


In my previous posts on reinforcement learning, I have used OpenAI Gym quite extensively for training in different gaming environments. But for real-world problems, you will need a new environment and not the pre-existing OpenAI Gym environments.


So, the question is

How to create a custom environment in OpenAI Gym?

But a bigger question is,

Why should you create an environment in OpenAI Gym?

In some of my previous tutorials, I designed the whole environment from scratch without the OpenAI Gym framework, and it worked quite well. So why use a framework at all? The answer is simple:

Standardized interface: OpenAI Gym provides a standardized interface for interacting with environments, which makes it easier to compare and reproduce results across different algorithms and research papers. So you can train and test different environments with different algorithms easily, as long as everything follows the same structure.

Reproducibility and sharing: By creating an environment in OpenAI Gym, you can share it with the research community, enabling others to reproduce your results and build upon your work.

Some RL libraries like stable-baselines, RLlib, tf-agents, etc. integrate easily with OpenAI Gym environments, so basic to advanced RL algorithms can be used to train agents with ease (without coding them from scratch).
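To illustrate that integration, here is a minimal sketch using stable-baselines3 (an assumption: it is installed via pip install stable-baselines3; also note it expects a Box/Discrete-style observation space, so a built-in env is used for illustration):

import gym
from stable_baselines3 import PPO

# Train a PPO agent on a Gym-registered environment without
# implementing the algorithm from scratch
env = gym.make('CartPole-v1')
model = PPO('MlpPolicy', env, verbose=0)
model.learn(total_timesteps=10_000)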

Now that we’re clear on why we need this, let’s understand how to do it.

MazeGame-v0

We will register a grid-based Maze game environment in OpenAI Gym with the following features

Start and End point (green and red)

Agent (Blue)

Obstacles (black)

  • The goal is to reach the end point from the start point while avoiding obstacles. To keep things easy, the reward system is naive: 1 if the endpoint is reached, else 0.
  • The action space includes 4 actions: Up, Down, Right, and Left, while the observation space is simply a grid of size rows x columns.
  • We will add all these features to our environment and render it using pygame.

To create a custom environment, we just need to subclass gym.Env and override a handful of its methods with our environment’s definition. The methods we necessarily need to override are listed below, followed by a bare skeleton:

  • __init__(): This function initializes your environment with default values
  • reset(): This function resets the environment to its default settings
  • step(): This function defines how the environment changes once the agent takes an action. Usually, the reward function is also incorporated/called within step()
  • render(): For rendering the environment. We will use pygame for rendering, but you can simply print the environment as well.
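Here is that bare skeleton, a minimal hypothetical example (placeholder spaces and return values only; CustomEnv is not the maze env we build below):

import gym
from gym import spaces

class CustomEnv(gym.Env):
    def __init__(self):
        super().__init__()
        # Define the action and observation spaces here
        self.action_space = spaces.Discrete(2)
        self.observation_space = spaces.Discrete(2)

    def reset(self):
        # Reset internal state and return the initial observation
        return 0

    def step(self, action):
        # Apply the action, then return (observation, reward, done, info)
        return 0, 0.0, True, {}

    def render(self):
        # Draw or print the current state
        pass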

Let’s get started now

  1. Import required libraries
import gym
from gym import spaces
import numpy as np
import pygame

2. Define the game class (read comments for better understanding)

class MazeGameEnv(gym.Env):
    def __init__(self, maze):
        super(MazeGameEnv, self).__init__()
        self.maze = np.array(maze)  # Maze represented as a 2D numpy array
        self.start_pos = np.where(self.maze == 'S')  # Starting position
        self.goal_pos = np.where(self.maze == 'G')   # Goal position
        self.current_pos = self.start_pos  # Starting position is the agent's current position
        self.num_rows, self.num_cols = self.maze.shape

        # 4 possible actions: 0=up, 1=down, 2=left, 3=right
        self.action_space = spaces.Discrete(4)

        # Observation space is a grid of size: rows x columns
        self.observation_space = spaces.Tuple((spaces.Discrete(self.num_rows), spaces.Discrete(self.num_cols)))

        # Initialize pygame
        pygame.init()
        self.cell_size = 125

        # Set the display size
        self.screen = pygame.display.set_mode((self.num_cols * self.cell_size, self.num_rows * self.cell_size))

    def reset(self):
        self.current_pos = self.start_pos
        return self.current_pos

    def step(self, action):
        # Move the agent based on the selected action
        new_pos = np.array(self.current_pos)
        if action == 0:    # Up
            new_pos[0] -= 1
        elif action == 1:  # Down
            new_pos[0] += 1
        elif action == 2:  # Left
            new_pos[1] -= 1
        elif action == 3:  # Right
            new_pos[1] += 1

        # Update the position only if the move is valid
        if self._is_valid_position(new_pos):
            self.current_pos = new_pos

        # Reward function: 1 on reaching the goal, 0 otherwise
        if np.array_equal(self.current_pos, self.goal_pos):
            reward = 1.0
            done = True
        else:
            reward = 0.0
            done = False

        return self.current_pos, reward, done, {}

    def _is_valid_position(self, pos):
        row, col = pos

        # If the agent goes out of the grid
        if row < 0 or col < 0 or row >= self.num_rows or col >= self.num_cols:
            return False

        # If the agent hits an obstacle
        if self.maze[row, col] == '#':
            return False
        return True

    def render(self):
        # Clear the screen
        self.screen.fill((255, 255, 255))

        # Draw env elements one cell at a time
        for row in range(self.num_rows):
            for col in range(self.num_cols):
                cell_left = col * self.cell_size
                cell_top = row * self.cell_size

                if self.maze[row, col] == '#':    # Obstacle
                    pygame.draw.rect(self.screen, (0, 0, 0), (cell_left, cell_top, self.cell_size, self.cell_size))
                elif self.maze[row, col] == 'S':  # Starting position
                    pygame.draw.rect(self.screen, (0, 255, 0), (cell_left, cell_top, self.cell_size, self.cell_size))
                elif self.maze[row, col] == 'G':  # Goal position
                    pygame.draw.rect(self.screen, (255, 0, 0), (cell_left, cell_top, self.cell_size, self.cell_size))

                if np.array_equal(np.array(self.current_pos), np.array([row, col]).reshape(-1, 1)):  # Agent position
                    pygame.draw.rect(self.screen, (0, 0, 255), (cell_left, cell_top, self.cell_size, self.cell_size))

        pygame.display.update()  # Update the display

The above is easy to understand:

__init__(): Initializes the required variables and the game environment. You need to pass a 2D array with the maze config to initialize it (demonstrated below).

reset(): Resets the agent’s position to the start position

step(): Updates the agent’s position according to the action taken and provides the reward

_is_valid_position(): Checks whether the action taken by the agent is valid

render(): Renders the game environment using pygame, drawing the elements cell by cell with nested loops. You can simply print the maze grid as well; pygame isn’t strictly required (a text-only sketch follows).
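Here is a sketch of that text-only alternative: a hypothetical render_text() method (not part of the original class) you could add to MazeGameEnv:

    def render_text(self):
        # Print the maze row by row, marking the agent's cell with 'A'
        agent_row, agent_col = np.array(self.current_pos).flatten()
        for row in range(self.num_rows):
            line = ''
            for col in range(self.num_cols):
                if row == agent_row and col == agent_col:
                    line += 'A'  # Agent
                else:
                    line += self.maze[row, col]  # 'S', 'G', '.', or '#'
            print(line)
        print()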

3. Save the above class in a Python script, say mazegame.py

4. In a new script, import this class and register it as a gym env with the name ‘MazeGame-v0’ (this can be any other name as well).

import gym
import pygame
from mazegame import MazeGameEnv

# Register the environment
gym.register(
    id='MazeGame-v0',
    entry_point='mazegame:MazeGameEnv',
    kwargs={'maze': None}
)

5. Time to load the environment

# Maze config
maze = [
    ['S', '.', '.', '.'],
    ['.', '#', '.', '#'],
    ['.', '.', '.', '.'],
    ['#', '.', '#', 'G'],
]

# Test the environment with random actions until the goal is reached
env = gym.make('MazeGame-v0', maze=maze)
obs = env.reset()
env.render()

done = False
while not done:
    pygame.event.get()
    action = env.action_space.sample()  # Random action selection
    obs, reward, done, _ = env.step(action)
    env.render()
    print('Reward:', reward)
    print('Done:', done)

    pygame.time.wait(200)

It is very similar to loading any other pre-existing environment in OpenAI Gym. Want to see the env you designed? Run the script above and the maze should appear in a pygame window.

Do remember that this registers your environment on your local system only; it isn’t globally available. For global availability, you would need to create a pull request to the gym repository.
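That said, registration is only required for the gym.make(...) interface; for quick local experiments you can instantiate the class directly:

from mazegame import MazeGameEnv

# Same environment, no registration needed (maze defined as above)
env = MazeGameEnv(maze=maze)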

That’s all for today, see you soon!!
