Introduction to Reinforcement Learning (Coding Q-Learning) — Part 3

Adesh Gautam
Jul 9, 2018 · 5 min read
Talk is cheap. Show me the code — Linus Torvalds

In the previous part, we saw what an MDP is and what Q-learning is. In this part, we’ll see how to solve a finite MDP using Q-learning and code it.

OpenAI Gym


import gym

# Create the 4x4 FrozenLake environment
env = gym.make('FrozenLake-v0')

FrozenLake Game
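To make the grid concrete without the picture, here is a minimal sketch of the board as plain text. It assumes Gym's standard 4x4 FrozenLake layout and states numbered row by row from 0; the names `FROZEN_LAKE_MAP` and `describe_state` are made up for illustration and are not part of Gym.

```python
# Assumed default 4x4 layout: S = start, F = frozen (safe), H = hole, G = goal.
FROZEN_LAKE_MAP = ["SFFF", "FHFH", "FFFH", "HFFG"]
N_COLS = 4


def describe_state(state):
    """Return (row, col, tile) for a state index numbered row by row from 0."""
    row, col = divmod(state, N_COLS)
    return row, col, FROZEN_LAKE_MAP[row][col]


# Look up a few states: the start, a hole, state 10, and the goal.
for s in (0, 5, 10, 15):
    print(s, describe_state(s))
```

With this numbering, state 10 sits in the third row and third column, directly above state 14, which is the tile to the left of the goal.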

State 10 with Q-values
Q-value update equation
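The standard Q-learning update is Q(s, a) ← Q(s, a) + α [r + γ max Q(s', a') − Q(s, a)], where the max runs over the actions a' available in the next state s'. The sketch below applies it once to state 10. The table size (16 states, 4 actions), the values α = 0.8 and γ = 0.95, and the helper name `q_update` are assumptions chosen for illustration, and action 1 is taken to mean "down" as in Gym's FrozenLake.

```python
N_STATES, N_ACTIONS = 16, 4


def q_update(q, state, action, reward, next_state, done, alpha=0.8, gamma=0.95):
    """Apply one Q-learning update in place and return the new Q(state, action)."""
    # A terminal next state has no future value to bootstrap from.
    best_next = 0.0 if done else max(q[next_state])
    td_target = reward + gamma * best_next
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


# Q-table as a list of rows: one row per state, one column per action.
q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

# Pretend the agent already learned that moving right from state 14 is worth 1.
q[14][2] = 1.0

# Now it moves down (action 1) from state 10 into state 14 with reward 0.
new_value = q_update(q, state=10, action=1, reward=0.0, next_state=14, done=False)
print(new_value)  # roughly 0.8 * 0.95 * 1.0
```

The reward of the goal leaks backwards one step at a time: state 10 receives a discounted share of the value of state 14 even though the move itself earned nothing.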

FrozenLake in action

Agent in action
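For readers who want to run something end to end, here is a self-contained sketch of tabular Q-learning with epsilon-greedy exploration. Note the swap: instead of calling Gym, it uses a small deterministic stand-in for FrozenLake (no slipping), so it is not the environment created by `gym.make` above. The `step`, `choose_action`, `train`, and `greedy_path` helpers and all hyperparameters are assumptions chosen for illustration.

```python
import random

MAP = ["SFFF", "FHFH", "FFFH", "HFFG"]  # assumed 4x4 layout
SIZE = 4
MOVES = {0: (0, -1), 1: (1, 0), 2: (0, 1), 3: (-1, 0)}  # left, down, right, up


def step(state, action):
    """Deterministic stand-in for env.step: returns (next_state, reward, done)."""
    row, col = divmod(state, SIZE)
    d_row, d_col = MOVES[action]
    # Moving into a wall leaves the agent where it is.
    row = min(max(row + d_row, 0), SIZE - 1)
    col = min(max(col + d_col, 0), SIZE - 1)
    tile = MAP[row][col]
    return row * SIZE + col, float(tile == "G"), tile in "HG"


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy choice; ties between best actions are broken at random."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    return rng.choice([a for a, v in enumerate(q_row) if v == best])


def train(episodes=5000, alpha=0.8, gamma=0.95, epsilon=0.1, max_steps=100, seed=0):
    """Run tabular Q-learning and return the learned Q-table."""
    rng = random.Random(seed)
    q = [[0.0] * len(MOVES) for _ in range(SIZE * SIZE)]
    for _ in range(episodes):
        state = 0  # every episode starts on the S tile
        for _ in range(max_steps):
            action = choose_action(q[state], epsilon, rng)
            next_state, reward, done = step(state, action)
            # Terminal states contribute no future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_path(q, max_steps=100):
    """Follow the highest-valued action from the start; return (path, last reward)."""
    state, path, reward = 0, [0], 0.0
    for _ in range(max_steps):
        action = max(range(len(MOVES)), key=lambda a: q[state][a])
        state, reward, done = step(state, action)
        path.append(state)
        if done:
            break
    return path, reward


q_table = train()
path, final_reward = greedy_path(q_table)
print("greedy path:", path, "reward:", final_reward)
```

On the real, slippery FrozenLake the same loop applies, but the learned policy only succeeds some of the time because a chosen move does not always go where intended.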

This story is published in The Startup, Medium’s largest entrepreneurship publication followed by 343,876+ people.
