Foundational RL: Dynamic Programming | by Rahul Bhadani | Dec, 2022
Road to Reinforcement Learning

Cover photo generated by the author using the AI tool Midjourney (licensed under the Creative Commons Noncommercial 4.0 asset license).

In the previous two articles, (1) Markov States, Markov Chain, and Markov Decision Process, and (2) Solving Markov Decision Process, I laid the foundation for developing a detailed understanding of reinforcement learning (RL). An RL problem is formulated as a Markov Decision Process (MDP), which can be solved for optimal policies (i.e. what action needs to be taken by an…