|
6 | 6 | 1. **Fundamentals.**
|
7 | 7 |
|
8 | 8 |
|
9 |
| - * Reinforcement learning |
| 9 | + * Reinforcement Learning |
10 | 10 | [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/1_Basics_1.pdf)]
|
11 | 11 | [[lecture note](https://github.com/wangshusen/DeepLearning/blob/master/LectureNotes/DRL/DRL.pdf)]
|
12 | 12 | [[Video (in Chinese)](https://youtu.be/vmkRMvhCW5c)].
|
13 | 13 |
|
14 |
| - * Value-based learning |
| 14 | + * Value-Based Learning |
15 | 15 | [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/1_Basics_2.pdf)]
|
16 | 16 | [[Video (in Chinese)](https://youtu.be/jflq6vNcZyA)].
|
17 | 17 |
|
18 |
| - * Policy-based learning |
| 18 | + * Policy-Based Learning |
19 | 19 | [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/1_Basics_3.pdf)]
|
20 | 20 | [[Video (in Chinese)](https://youtu.be/qI0vyfR2_Rc)].
|
21 | 21 |
|
22 |
| - * Actor-critic methods |
| 22 | + * Actor-Critic Methods |
23 | 23 | [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/1_Basics_4.pdf)]
|
24 | 24 | [[Video (in Chinese)](https://youtu.be/xjd7Jq9wPQY)].
|
25 | 25 |
|
|
37 | 37 | * Double DQN.
|
38 | 38 |
|
39 | 39 | * Dueling DQN.
|
| 40 | + |
| 41 | + * Multi-Step Return. |
40 | 42 |
|
41 | 43 |
|
42 | 44 |
|
|
48 | 50 | * Advantage Actor-Critic (A2C).
|
49 | 51 |
|
50 | 52 | * Trust-Region Policy Optimization (TRPO).
|
| 53 | + |
| 54 | + * Policy Network + RNNs. |
51 | 55 |
|
52 | 56 |
|
53 | 57 | 4. **Multi-Agent Reinforcement Learning.**
|
54 | 58 |
|
55 |
| - * Basics and challenges |
| 59 | + * Basics and Challenges |
56 | 60 | [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/4_MARL_1.pdf)]
|
57 | 61 | [[Video (in Chinese)](https://youtu.be/KN-XMQFTD0o)].
|
58 | 62 |
|
59 | 63 | * Centralized VS Decentralized
|
60 | 64 | [[slides](https://github.com/wangshusen/DRL/blob/master/Slides/4_MARL_2.pdf)]
|
61 | 65 | [[Video (in Chinese)](https://youtu.be/0HV1hsjd1y8)].
|
62 | 66 |
|
| 67 | + |
| 68 | + |
| 69 | +5. **Imitation Learning.** |
| 70 | + |
| 71 | + |
| 72 | + * Inverse Reinforcement Learning. |
| 73 | + |
| 74 | + * Generative Adversarial Imitation Learning (GAIL). |