Published in The Startup

Using Q-Learning for OpenAI’s CartPole-v1

Equation: the Q-learning update rule, from Wikipedia contributors [3].
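The equation image itself did not survive in this copy. As a hedged reconstruction, the Q-learning update rule is commonly written as follows, where the learning rate α and discount factor γ correspond to `LEARNING_RATE` and `DISCOUNT` in the code. The symbols follow the usual textbook notation and are not taken from the original figure.

```latex
Q^{\text{new}}(s_t, a_t) \leftarrow (1 - \alpha)\, Q(s_t, a_t)
  + \alpha \left( r_{t+1} + \gamma \max_{a} Q(s_{t+1}, a) \right)
```

Here $s_t$ and $a_t$ are the current state and action, $r_{t+1}$ is the reward received after taking $a_t$, and $\max_{a} Q(s_{t+1}, a)$ is the best estimated value available from the next state.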
import numpy as np  # arrays and the Q-table
import gym  # provides the CartPole environment
import time  # timing the training run
import math  # numeric helpers
env = gym.make("CartPole-v1")
print(env.action_space.n)
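LEARNING_RATE = 0.1  # step size (alpha) for the Q-value update
DISCOUNT = 0.95  # discount factor (gamma) applied to future rewards
EPISODES = 60000  # number of training episodes
total = 0
total_reward = 0
prior_reward = 0

Observation = [30, 30, 50, 50]  # number of bins per observation dimension
np_array_win_size = np.array([0.25, 0.25, 0.01, 0.1])  # bin width per dimension

epsilon = 1  # initial exploration rate
epsilon_decay_value = 0.99995  # multiplicative decay factor for epsilon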
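The excerpt does not show how `epsilon` and `epsilon_decay_value` are used, so the following is only a minimal sketch of one common scheme, epsilon-greedy action selection with multiplicative decay. The helper names `choose_action` and `decay_epsilon` and the seeded generator are assumptions made for illustration, not part of the original code.

```python
import numpy as np

# Seeded generator so the sketch is reproducible (an assumption, not in the source)
rng = np.random.default_rng(0)


def choose_action(q_values, epsilon, rng):
    """Hypothetical helper: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random
        return int(rng.integers(len(q_values)))
    # Exploit: pick the action with the highest estimated value
    return int(np.argmax(q_values))


def decay_epsilon(epsilon, decay_value=0.99995):
    """Hypothetical helper: shrink epsilon by a constant factor."""
    return epsilon * decay_value


q_values = np.array([0.1, 0.9])
greedy_action = choose_action(q_values, epsilon=0.0, rng=rng)  # always exploits
random_action = choose_action(q_values, epsilon=1.0, rng=rng)  # always explores
next_epsilon = decay_epsilon(1.0)
```

If the decay were applied once per episode, a factor of 0.99995 would take epsilon from 1 to about 0.05 over 60,000 episodes, so exploration would fade gradually rather than stop abruptly.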
q_table = np.random.uniform(low=0, high=1, size=(Observation + [env.action_space.n]))
print(q_table.shape)  # (30, 30, 50, 50, 2)
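def get_discrete_state(state):
    # Scale each observation by its bin width, then shift by a per-dimension offset
    discrete_state = state / np_array_win_size + np.array([15, 10, 1, 10])
    # np.int was removed in NumPy 1.24; the built-in int is the replacement
    return tuple(discrete_state.astype(int))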
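To show how the discretized state indexes the Q-table, here is a hedged sketch of a single Q-learning update. It runs without `gym` by hard-coding two actions and using made-up observations. The helper `q_update`, the zero-initialized table, and the sample states are assumptions for illustration; the excerpt does not include the author's training loop.

```python
import numpy as np

LEARNING_RATE = 0.1
DISCOUNT = 0.95
BINS = [30, 30, 50, 50]                       # same bin counts as Observation
WIN_SIZE = np.array([0.25, 0.25, 0.01, 0.1])  # same bin widths as np_array_win_size
OFFSET = np.array([15, 10, 1, 10])
N_ACTIONS = 2  # CartPole has two actions: push left, push right

# Zeros instead of random values, so the result of the demo is predictable
q_table = np.zeros(BINS + [N_ACTIONS])


def get_discrete_state(state):
    """Map a continuous observation to a tuple of Q-table indices."""
    return tuple((np.asarray(state) / WIN_SIZE + OFFSET).astype(int))


def q_update(q_table, state, action, reward, next_state):
    """Hypothetical helper: apply one Q-learning update in place."""
    s = get_discrete_state(state)
    s_next = get_discrete_state(next_state)
    max_future_q = np.max(q_table[s_next])  # best value reachable from next state
    current_q = q_table[s + (action,)]
    q_table[s + (action,)] = (1 - LEARNING_RATE) * current_q + LEARNING_RATE * (
        reward + DISCOUNT * max_future_q
    )
    return q_table[s + (action,)]


# Made-up observations: cart position, cart velocity, pole angle, pole angular velocity
state = np.array([0.0, 0.0, 0.0, 0.0])
next_state = np.array([0.01, 0.2, -0.005, -0.35])

new_q = q_update(q_table, state, action=1, reward=1.0, next_state=next_state)
print(new_q)
```

Note that negative indices are valid in NumPy and count from the end of an axis, so an observation that falls below the offset still addresses a cell in the table instead of raising an error.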

Ali Fakhry

Reinforcement learning, artificial intelligence, and software. NYU.