Course Review: Fundamentals of Reinforcement Learning [Coursera]

Harshit Sharma
Sep 20, 2019

Reinforcement learning is one of the machine learning techniques getting a lot of attention recently. I get excited whenever I get to learn something new, especially when it is related to machine learning.

I had been meaning to learn the fundamentals of reinforcement learning for quite some time, and like every other such plan, it just sat in my bookmark list waiting for action. Recently the University of Alberta launched a Reinforcement Learning specialization on Coursera, which lets you dive deep into the fundamentals of the field. Instructors Martha White and Adam White did a great job of explaining them. Anyone who is interested in this area of machine learning and waiting for a kick start must give this course a try.

So what is reinforcement learning? The more I think about it, the more it feels like our conscience, which helps us take actions in our day-to-day life. Don’t get confused by the word conscience. What I mean is that every action we take might have some reward attached to it. For example, when we play games on our consoles there is one clear end goal, which is to complete the mission, and our actions and movements determine how soon we finish it. At the same time, there is some reward for every action or series of actions, which reinforces the importance of those actions at a particular stage.

Likewise, reinforcement learning involves an Environment, Actions, Rewards and States, which map closely onto the game itself (Environment), the action or series of actions we take while playing (such as movements), points/gems (Rewards) and stages (States).

Reinforcement learning lets us improve how a learning algorithm behaves by trying different approaches and carefully assessing the final reward that results from the series of actions we take in an environment across different states. How can we do it better? How do we frame problems along the same lines so that we can improve them through reinforcement learning?

As you start this course you will realize that it requires you to go through reading material. The readings give you confidence, and personally this course is one of my favorites. Unlike other courses, it focuses on understanding each concept on your own by going through the reading material first, followed by videos that reinforce those concepts. I am not sure how many courses out there follow a similar structure, but I personally liked this approach.

Week 1 focuses on the k-armed bandit problem and the exploration/exploitation tradeoff. Hands down, this is the entry point into the world of reinforcement learning. It's very important to set the intuition right and understand why we need reinforcement learning in the first place.
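To make the exploration/exploitation tradeoff concrete, here is a minimal sketch of an epsilon-greedy agent on a k-armed bandit. It is not taken from the course material: the function name run_bandit, the Gaussian reward noise and the default parameter values are all my own assumptions for illustration.

```python
import random


def run_bandit(true_means, epsilon=0.1, steps=1000, seed=0):
    """Epsilon-greedy agent with sample-average value estimates.

    true_means is a hypothetical list of each arm's mean reward;
    rewards are assumed to be Gaussian with unit variance.
    """
    rng = random.Random(seed)
    k = len(true_means)
    estimates = [0.0] * k  # current estimate of each arm's mean reward
    counts = [0] * k       # number of times each arm has been pulled
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(k)  # explore: pick a random arm
        else:
            # exploit: pick the arm with the highest current estimate
            arm = max(range(k), key=lambda a: estimates[a])

        reward = rng.gauss(true_means[arm], 1.0)
        counts[arm] += 1
        # incremental sample-average update: Q <- Q + (R - Q) / N
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward
```

With epsilon set to 0 the agent only exploits and can get stuck on the first arm that looks good, while a small positive epsilon keeps sampling the other arms, so the estimates for all of them keep improving.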

Week 2 focuses on Markov Decision Processes (MDPs) and different types of tasks, such as episodic and continuing tasks. If you are familiar with reinforcement learning, or as you proceed further in this course, you will realise that the MDP is one of the foundations and will never leave your side.
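The difference between episodic and continuing tasks shows up in how the return (the total reward the agent tries to maximize) is computed. The sketch below is my own illustration, not course code, and the helper name discounted_return is made up. It assumes the usual discounted return, where a reward received k steps later is weighted by gamma to the power k.

```python
def discounted_return(rewards, gamma=0.9):
    """Return G = R1 + gamma*R2 + gamma^2*R3 + ... for a list of rewards."""
    g = 0.0
    # Work backwards so each step applies G_t = R_{t+1} + gamma * G_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


# Episodic task: the episode ends, so gamma = 1 simply sums the rewards.
episodic = discounted_return([1.0, 2.0, 3.0], gamma=1.0)

# Continuing task: rewards never stop, so gamma < 1 keeps the return finite.
# A long run of reward 1 approaches 1 / (1 - gamma).
continuing = discounted_return([1.0] * 1000, gamma=0.9)
```

In the continuing case the list of 1000 rewards is only a stand-in for an endless stream; the point is that discounting makes the sum settle near a finite value.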

Week 3 introduces policies and value functions, the Bellman equations and optimality. After defining the task, it's also important to calculate the value of each state and to determine whether that value is optimal. If it is not, we might need to change the policy or try a different one.
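For reference, these are the two Bellman equations for the state-value function as I understand them, written in the standard textbook notation, which may differ slightly from the course slides. Here \pi(a \mid s) is the policy, p(s', r \mid s, a) is the environment dynamics and \gamma is the discount factor.

```latex
% Bellman equation for the value of a given policy \pi
v_\pi(s) = \sum_{a} \pi(a \mid s) \sum_{s', r} p(s', r \mid s, a)
           \left[ r + \gamma \, v_\pi(s') \right]

% Bellman optimality equation: the best achievable value in each state
v_*(s) = \max_{a} \sum_{s', r} p(s', r \mid s, a)
         \left[ r + \gamma \, v_*(s') \right]
```

A policy is optimal when its value function satisfies the second equation in every state, which is exactly the "is the value optimal or not" check described above.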

Week 4 focuses on Dynamic Programming and using it for policy evaluation and policy iteration, followed by an assignment to find the optimal policy using dynamic programming. I really enjoyed this assignment: Coursera provides a model of parking demand with a reward function that reflects its preferences, and the task is to determine the optimal policy.
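To give a feel for policy evaluation and policy iteration, here is a rough sketch on a tiny made-up MDP. This is not the parking assignment: the two-state model P, the helper names and the tolerance values are all hypothetical and chosen only to keep the example short.

```python
# Hypothetical 2-state, 2-action MDP (not the course's parking model).
# P[state][action] is a list of (probability, next_state, reward) tuples.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}


def q_value(P, V, s, a, gamma):
    """Expected reward plus discounted next-state value for action a in state s."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def policy_evaluation(P, policy, gamma=0.9, theta=1e-8):
    """Sweep the Bellman equation for a fixed policy until values stop changing."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            v_new = q_value(P, V, s, policy[s], gamma)
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < theta:
            return V


def policy_iteration(P, gamma=0.9):
    """Alternate evaluation and greedy improvement until the policy is stable."""
    policy = {s: min(P[s]) for s in P}  # arbitrary start: lowest-numbered action
    while True:
        V = policy_evaluation(P, policy, gamma)
        stable = True
        for s in P:
            best = max(P[s], key=lambda a: q_value(P, V, s, a, gamma))
            # Only switch when the new action is strictly better (avoids tie loops).
            if q_value(P, V, s, best, gamma) > q_value(P, V, s, policy[s], gamma) + 1e-12:
                policy[s] = best
                stable = False
        if stable:
            return policy, V


policy, V = policy_iteration(P)
```

On this toy model the loop settles on action 1 in both states, since staying in state 1 keeps paying the larger reward. The assignment follows the same evaluate-then-improve pattern, just with a richer model.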

After you are done with the course, you just cannot wait to start the next part in the series. I am super excited to give it a try and explore more of reinforcement learning.
