Week 6 Part II · Training Infrastructure
Instructor lesson plan: lecture (3 h) and practice (2 h).
| 0:00–0:10 | 10 min | Recap & retrieval: Open with two quick questions on last week's material (retrieval practice), then state this week's objectives. |
| 0:10–0:25 | 15 min | Motivation: The same model, trained with a different optimizer or learning rate, can give wildly different results. |
| 0:25–1:10 | 45 min | Gradient descent and its variants |
| 1:10–1:20 | 10 min | Break |
| 1:20–2:05 | 45 min | Learning rate and dynamics |
| 2:05–2:35 | 30 min | Live demo (predict, then run): SGD versus momentum versus Adam on one model, a three-rate sweep, and a step-decay schedule. Before running the three-rate sweep, ask the class to rank the three curves (too small, good, too large), then reveal them. |
| 2:35–2:50 | 15 min | Wrap-up & practice preview: Revisit the misconception and concept checks below, recap the takeaways, and preview the practice lesson. |
| 2:50–3:00 | 10 min | Buffer & questions |
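The optimizer comparison planned for the live demo could be sketched roughly as follows. This is a minimal illustration, not the course notebook: the toy quadratic loss, the starting point, the hyperparameters, and all function names are assumptions chosen for the example, and the gradients are exact (full-batch), so the "SGD" run has no sampling noise.

```python
import math

# Assumed toy problem: a quadratic bowl with one flat and one steep direction.
CURVATURE = (1.0, 25.0)


def loss(w):
    return 0.5 * sum(a * x * x for a, x in zip(CURVATURE, w))


def grad(w):
    return [a * x for a, x in zip(CURVATURE, w)]


def sgd(w, state, lr=0.03):
    """Plain gradient step (exact gradients, so no stochastic noise here)."""
    return [x - lr * g for x, g in zip(w, grad(w))]


def momentum(w, state, lr=0.03, beta=0.9):
    """Heavy-ball momentum: the velocity accumulates past gradients."""
    v = state.setdefault("v", [0.0] * len(w))
    g = grad(w)
    for i in range(len(w)):
        v[i] = beta * v[i] + g[i]
    return [x - lr * vi for x, vi in zip(w, v)]


def adam(w, state, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam with bias-corrected first and second moment estimates."""
    m = state.setdefault("m", [0.0] * len(w))
    v = state.setdefault("v", [0.0] * len(w))
    state["t"] = state.get("t", 0) + 1
    t = state["t"]
    g = grad(w)
    new_w = []
    for i in range(len(w)):
        m[i] = beta1 * m[i] + (1 - beta1) * g[i]
        v[i] = beta2 * v[i] + (1 - beta2) * g[i] ** 2
        m_hat = m[i] / (1 - beta1 ** t)
        v_hat = v[i] / (1 - beta2 ** t)
        new_w.append(w[i] - lr * m_hat / (math.sqrt(v_hat) + eps))
    return new_w


def train(step_fn, steps=100, w0=(5.0, 1.0)):
    """Return the loss before each step plus the final loss."""
    w, state, history = list(w0), {}, []
    for _ in range(steps):
        history.append(loss(w))
        w = step_fn(w, state)
    history.append(loss(w))
    return history


histories = {
    name: train(fn)
    for name, fn in [("sgd", sgd), ("momentum", momentum), ("adam", adam)]
}
for name, h in histories.items():
    print(f"{name:9s} start={h[0]:.3f} end={h[-1]:.6f}")
```

Plotting each entry of `histories` on a shared log-scale axis gives the class one curve per optimizer to compare; swapping in a real model and mini-batches would bring the sketch closer to the demo described in the schedule.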
Students often think: A smaller learning rate is always the safer choice.
Set it straight: A rate that is too small crawls or stalls in a poor region. The rate must sit in the right range, and schedules help; there is no universally safe tiny value.
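The claim about rate ranges and schedules can be made concrete with a small sketch. It assumes a one-dimensional quadratic loss with curvature 10, three hand-picked rates, and a hypothetical `step_decay` helper that halves the rate every 10 steps; these choices are for illustration only and are not taken from the course materials.

```python
def loss(w, curvature=10.0):
    return 0.5 * curvature * w * w


def grad(w, curvature=10.0):
    return curvature * w


def run_gd(lr_schedule, steps=50, w0=1.0):
    """Gradient descent where lr_schedule maps a step index to a rate."""
    w = w0
    for t in range(steps):
        w -= lr_schedule(t) * grad(w)
    return loss(w)


def constant(lr):
    return lambda t: lr


def step_decay(lr0, drop=0.5, every=10):
    """Multiply the rate by `drop` after every `every` steps."""
    return lambda t: lr0 * drop ** (t // every)


initial_loss = loss(1.0)

# Three-rate sweep: each rate is run for the same number of steps.
sweep = {
    name: run_gd(constant(lr))
    for name, lr in [("too small", 0.001), ("good", 0.05), ("too large", 0.25)]
}

# A rate near the stability edge, held fixed versus decayed in steps.
fixed = run_gd(constant(0.19))
decayed = run_gd(step_decay(0.19))

print(f"initial loss: {initial_loss:.3g}")
for name, final in sweep.items():
    print(f"{name:10s} final loss: {final:.3g}")
print(f"fixed 0.19  final loss: {fixed:.3g}")
print(f"step decay  final loss: {decayed:.3g}")
```

On this toy problem the smallest rate barely reduces the loss in 50 steps, the largest rate makes it grow, and decaying an aggressive starting rate ends lower than holding that rate fixed.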
In the practice lesson the instructor demonstrates implementations, runs code, and works through examples using the practice notebook linked below. The weekly lab is then set as homework, in which students apply the material themselves.
| 0:00–0:10 | 10 min | Setup & recap: Recap the lecture's key ideas and open the working notebook. |
| 0:10–1:00 | 50 min | Instructor demonstrations |
| 1:00–1:05 | 5 min | Break |
| 1:05–1:45 | 40 min | Instructor demonstrations (continued) |
| 1:45–2:00 | 15 min | Wrap-up & lab brief: Summarize the patterns shown and brief the weekly lab (homework), which students complete on their own. |
Open the practice notebook in Colab · Curated references · Lab (homework)