Module 7: Reinforcement Learning
Topic 2: Temporal Difference Learning
Introductory RL methods
In this module, we will discuss several specific learning methods for RL. There are a lot of advanced methods out there! We could spend an entire class on RL alone, even without the rest of ML. Since we only have a short time to cover it in this class, we are starting with the basic methods that provide the foundation for the more advanced ones. Your reading for this topic is Section 22.3 (Active RL).
Learning the state-value function
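We will start by discussing the basic approach to learning the state-value function (V). This approach is called temporal difference learning, or TD learning. After each step, it nudges the estimate of V for the current state toward the observed reward plus the discounted estimate of V for the next state.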
Link to my slides
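To make the TD update concrete, here is a minimal sketch of TD(0) value estimation in Python. The environment is an assumption made for illustration and does not come from the slides or the reading: a small random-walk chain with a terminal state at each end, a fixed policy that moves left or right at random, and a reward of 1 only for reaching the right end. The function name and the parameters alpha (step size) and gamma (discount) are likewise hypothetical choices.

```python
import random


def td0_random_walk(num_states=5, episodes=5000, alpha=0.02, gamma=1.0, seed=0):
    """Estimate V for a fixed random policy on a hypothetical chain using TD(0)."""
    rng = random.Random(seed)
    # Indices 0 and num_states + 1 are terminal states; their value stays 0.
    V = [0.0] * (num_states + 2)
    right_terminal = num_states + 1
    for _ in range(episodes):
        s = (num_states + 1) // 2  # every episode starts in the middle state
        while 0 < s < right_terminal:
            s_next = s + rng.choice((-1, 1))  # fixed policy: step left or right
            r = 1.0 if s_next == right_terminal else 0.0
            # TD(0) update: move V(s) toward the one-step target r + gamma * V(s').
            V[s] += alpha * (r + gamma * V[s_next] - V[s])
            s = s_next
    return V[1:-1]  # values of the non-terminal states only


values = td0_random_walk()
print([round(v, 2) for v in values])
```

With gamma = 1, the true values for this chain are 1/6, 2/6, ..., 5/6, so the printed estimates should rise from left to right. Because the step size is constant, the estimates keep fluctuating around those values rather than settling exactly.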
Learning the action-value function
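The action-value function (Q) can also be learned in multiple ways. I discuss the two standard approaches here: SARSA and Q-learning. Both are TD methods. SARSA is on-policy (its update uses the action the agent actually takes next), while Q-learning is off-policy (its update uses the best available next action). If you choose RL for your project, either of these will work well!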
Link to my slides
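The difference between SARSA and Q-learning comes down to one term in the update target. The sketch below shows both tabular updates side by side. The function names, the dictionary-based Q-table, and the toy states and actions are assumptions made for illustration, not code from the slides.

```python
from collections import defaultdict


def sarsa_update(Q, s, a, r, s_next, a_next, alpha, gamma):
    """On-policy: the target uses the action a_next the agent actually takes next."""
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def q_learning_update(Q, s, a, r, s_next, actions, alpha, gamma):
    """Off-policy: the target uses the highest-valued action available in s_next."""
    target = r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])


actions = ["left", "right"]  # hypothetical action set


def make_q():
    # Unseen (state, action) pairs default to 0.0, which also keeps
    # terminal-state values at 0 as long as they are never updated.
    Q = defaultdict(float)
    Q[("s1", "left")] = 1.0
    Q[("s1", "right")] = 3.0
    return Q


# Same transition (s0, right, reward 0, s1); the agent's next action is "left".
Q_sarsa = make_q()
sarsa_update(Q_sarsa, "s0", "right", 0.0, "s1", "left", alpha=0.5, gamma=1.0)

Q_qlearn = make_q()
q_learning_update(Q_qlearn, "s0", "right", 0.0, "s1", actions, alpha=0.5, gamma=1.0)

print(Q_sarsa[("s0", "right")])   # 0.5
print(Q_qlearn[("s0", "right")])  # 1.5
```

On the same transition, SARSA backs up the value of the action that was actually chosen next (1.0), while Q-learning backs up the larger value (3.0). In a full agent, either update would sit inside a loop that picks actions with an exploration rule such as epsilon-greedy.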
Exercise
Complete the exercise on RL methods