Module 7: Reinforcement Learning

Topic 2: Temporal Difference Learning

Introductory RL methods

In this module, we will discuss several specific learning methods for RL. There are a lot of advanced methods out there! We could do an entire class on just RL, even without the rest of ML! But we only have a short time in this class to cover it, so we are starting with the basic methods that provide the foundation for the more advanced ones. Your reading for this topic is Section 22.3 (Active RL).

Learning the state-value function

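We will start by discussing the basic approach to learning the state-value function (V). This approach is called temporal difference learning, or TD learning. After each step, it nudges the estimate of the current state's value toward the observed reward plus the (discounted) estimate of the next state's value.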
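To make the TD update concrete, here is a minimal sketch in Python. It is an illustration only, not code from the slides or the reading: the function names, the five-state random-walk environment, and the values of the step size `alpha` and discount `gamma` are all assumptions chosen for the example.

```python
import random


def td0_update(V, s, r, s_next, alpha=0.1, gamma=1.0, terminal=False):
    """Apply one TD(0) update to V[s] in place and return the TD error.

    V is a list of value estimates indexed by state (an assumption of this
    sketch). If the step ended the episode, the next state's value is
    treated as 0.
    """
    target = r + (0.0 if terminal else gamma * V[s_next])
    td_error = target - V[s]
    V[s] += alpha * td_error
    return td_error


def run_random_walk(episodes=1000, n_states=5, alpha=0.05, seed=0):
    """Estimate V for a random policy on a hypothetical 1-D random walk.

    The agent starts in the middle state and moves left or right with equal
    probability. Stepping off the right end gives reward 1, stepping off the
    left end gives reward 0, and either one ends the episode.
    """
    rng = random.Random(seed)
    V = [0.0] * n_states
    for _ in range(episodes):
        s = n_states // 2
        while True:
            s_next = s + rng.choice((-1, 1))
            if s_next < 0:  # fell off the left end
                td0_update(V, s, 0.0, s, alpha, terminal=True)
                break
            if s_next >= n_states:  # fell off the right end
                td0_update(V, s, 1.0, s, alpha, terminal=True)
                break
            td0_update(V, s, 0.0, s_next, alpha)
            s = s_next
    return V


if __name__ == "__main__":
    print([round(v, 2) for v in run_random_walk()])
```

Notice that each update uses only one observed transition (state, reward, next state), so the agent can learn during an episode instead of waiting for it to finish.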

Link to my slides

Learning the action-value function

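The action-value function (Q) can also be learned in multiple ways. I discuss the two standard approaches here: SARSA and Q-learning. Both are TD methods. SARSA updates toward the value of the action the agent actually takes next (on-policy), while Q-learning updates toward the value of the best available next action (off-policy). If you choose RL for your project, either of these will work well!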
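The difference between the two approaches is easiest to see side by side. The sketch below is an illustration under stated assumptions, not the implementation from the slides: the function names, the dictionary-based Q-table keyed by `(state, action)`, and the state and action labels are all hypothetical.

```python
from collections import defaultdict


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """SARSA: move Q(s, a) toward r + gamma * Q(s', a'),
    where a' is the action the agent actually chose in s'."""
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def q_learning_update(Q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Q-learning: move Q(s, a) toward r + gamma * max_b Q(s', b),
    regardless of which action the agent takes next."""
    target = r + gamma * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def make_table():
    """Build a tiny hypothetical Q-table; unseen entries default to 0.0."""
    Q = defaultdict(float)
    Q[("s1", "left")] = 1.0
    Q[("s1", "right")] = 2.0
    return Q


if __name__ == "__main__":
    actions = ("left", "right")

    # Same transition for both methods: in s0 take "right", get reward 0,
    # land in s1. The agent's next action happens to be "left".
    Q_sarsa = make_table()
    sarsa_update(Q_sarsa, "s0", "right", 0.0, "s1", "left",
                 alpha=0.5, gamma=1.0)

    Q_qlearn = make_table()
    q_learning_update(Q_qlearn, "s0", "right", 0.0, "s1", actions,
                      alpha=0.5, gamma=1.0)

    print(Q_sarsa[("s0", "right")])   # uses Q(s1, left) = 1.0
    print(Q_qlearn[("s0", "right")])  # uses max over s1 = 2.0
```

This sketch relies on terminal states never being updated, so their Q-values stay at the default of 0.0. In a full agent, the next action would usually come from an exploration strategy such as epsilon-greedy, which is where the two methods start to behave differently.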