| Week | Course Content |
| --- | --- |
| Week 1 | Class Introduction & Reinforcement Learning Overview |
| Week 2 | Markov Decision Processes (MDPs) |
| Week 3 | Dynamic Programming: Prediction & Control |
| Week 4 | Monte Carlo Methods |
| Week 5 | Temporal Difference Learning |
| Week 6 | n-Step Temporal Difference Methods |
| Week 7 | Supervised Learning, Neural Networks & PyTorch |
| Week 8 | On-policy Prediction with Function Approximation |
| Week 9 | Control with Value Function Approximation |
| Week 10 | Policy Gradient Methods |
| Week 11 | Actor-Critic Methods |
| Week 12 | Evolutionary Algorithms |
| Week 13 | Continuing Tasks, Rollout Algorithms, Off-policy Actor-Critic, Multi-agent |
| Week 14 | Class Review |
| Week 15 | Final Project Presentations |
| Week 16 | Final Project Presentations (continued) |
| Self-study | Supplemental lecture materials & code examples |