- Luiza Santos, "Multi-Armed Bandits: Action-value Methods and the 10-armed testbed (DRL series part 6)": This article refers to sections 2.2–2.6 of chapter 2 about Multi-Armed Bandits from the book Reinforcement Learning by Richard Sutton… (Jun 17)
- Ugur Yildirim in Towards Data Science, "An Overview of Contextual Bandits": A dynamic approach to treatment personalization (Feb 21)
- Massimiliano Costacurta in Towards Data Science, "Dynamic Pricing with Multi-Armed Bandit: Learning by Doing": Applying Reinforcement Learning strategies to real-world use cases, especially in dynamic pricing, can reveal many surprises (Aug 16, 2023)
- Luiza Santos, "Multi-Armed Bandits: Exploitation versus Exploration (DRL series part 5)": This article refers to the end of section 2.1 of chapter 2 about Multi-Armed Bandits from the book Reinforcement Learning by Richard Sutton… (Jun 15)
- Hennie de Harder in Towards Data Science, "Solving Multi-Armed Bandit Problems": A powerful and easy way to apply reinforcement learning. (Nov 4, 2022)
- Luiza Santos, "Multi-Armed Bandits: an Overview (DRL series part 4)": This article refers to the beginning of section 2.1 of chapter 2 from the book Reinforcement Learning by Richard Sutton (pages 25–26) (Jun 14)
- Yuki Minai, "Exploring Multi-Armed Bandit Problem: Epsilon-Greedy, Epsilon-Decreasing, UCB, and Thompson…": To tackle the multi-armed bandit problem, we will learn well-established algorithms such as the Greedy algorithm, UCB, and Thompson Sampling (Nov 20, 2023)