Week 4 Part II · Training Infrastructure
Instructor lesson plan: lecture (3 h) and practice (2 h).
| 0:00–0:10 | 10 min | Recap & retrieval: Open with two quick questions on last week's material (retrieval practice), then state this week's objectives. |
| 0:10–0:25 | 15 min | Motivation: Models are only as good as the data pipeline; leakage silently inflates results. |
| 0:25–1:10 | 45 min | Dataset and DataLoader |
| 1:10–1:20 | 10 min | Break |
| 1:20–2:05 | 45 min | Splits and leakage |
| 2:05–2:35 | 30 min | Live demo (predict, then run): Write a custom Dataset and DataLoader and iterate over batches. Before running the leak demo, ask the class to predict whether the leaked-normalization accuracy will be higher or lower than the honest one, then show the normalization leak inflating accuracy and fix it. |
| 2:35–2:50 | 15 min | Wrap-up & practice preview: Revisit the misconception and concept checks below, recap the takeaways, and preview the practice lesson. |
| 2:50–3:00 | 10 min | Buffer & questions |
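The first half of the live demo (a custom Dataset, a loader, and a loop over batches) could look roughly like the sketch below. The plan does not name a framework, so this version uses only the Python standard library and mirrors the map-style protocol (`__len__` and `__getitem__`) that PyTorch's `torch.utils.data.Dataset` also uses. The names `PairDataset` and `iter_batches` are hypothetical stand-ins chosen for illustration, not part of any library.

```python
import random


class PairDataset:
    """Map-style dataset: indexable, with a known length (hypothetical name)."""

    def __init__(self, features, labels):
        if len(features) != len(labels):
            raise ValueError("features and labels must have the same length")
        self.features = features
        self.labels = labels

    def __len__(self):
        return len(self.features)

    def __getitem__(self, idx):
        # Return one (features, label) pair, as a loader expects.
        return self.features[idx], self.labels[idx]


def iter_batches(dataset, batch_size, shuffle=False, seed=0):
    """Yield (features, labels) batches; a stand-in for a DataLoader."""
    indices = list(range(len(dataset)))
    if shuffle:
        # Seeded shuffle so the demo is reproducible in class.
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        chunk = indices[start:start + batch_size]
        xs, ys = zip(*(dataset[i] for i in chunk))
        yield list(xs), list(ys)


# Toy data: 10 samples with 2 features each and a binary label.
features = [[float(i), float(i) * 2] for i in range(10)]
labels = [i % 2 for i in range(10)]
dataset = PairDataset(features, labels)

batches = list(iter_batches(dataset, batch_size=4, shuffle=True, seed=0))
for xs, ys in batches:
    print(len(xs), ys)
```

With 10 samples and a batch size of 4, the last batch is smaller than the rest, which is a useful point to raise before students meet a `drop_last`-style option in a real loader.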
Students often think: Normalizing the whole dataset before splitting is harmless; it is just scaling.
Set it straight: Fitting normalization statistics on all data leaks test information into training and silently inflates results; fit on the training split only, then apply to val and test.
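The fix can be sketched in a few lines: split first, fit the statistics on the training split only, then reuse those same statistics for validation and test. The example below is a minimal standard-library sketch on synthetic data; the helper names `fit_standardizer` and `apply_standardizer`, the 70/15/15 split, and the Gaussian toy data are all assumptions made for illustration.

```python
import random
import statistics


def fit_standardizer(values):
    """Return (mean, std) computed from the given values only."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        std = 1.0  # avoid division by zero for constant features
    return mean, std


def apply_standardizer(values, mean, std):
    """Scale values with statistics that were fitted elsewhere."""
    return [(v - mean) / std for v in values]


# Synthetic one-dimensional feature (assumed data, for illustration only).
rng = random.Random(0)
data = [rng.gauss(5.0, 2.0) for _ in range(100)]
rng.shuffle(data)

# Split BEFORE computing any statistics.
train, val, test = data[:70], data[70:85], data[85:]

# Honest pipeline: fit on the training split, apply everywhere.
mean, std = fit_standardizer(train)
train_n = apply_standardizer(train, mean, std)
val_n = apply_standardizer(val, mean, std)
test_n = apply_standardizer(test, mean, std)

# Leaky pipeline (what the misconception does): statistics see val and test.
leaky_mean, leaky_std = fit_standardizer(data)

print("train-only stats:", mean, std)
print("all-data stats:  ", leaky_mean, leaky_std)
```

The two sets of statistics differ because the leaky version has already seen the held-out samples. On a toy feature like this the gap is small, so the sketch shows the mechanism rather than the size of the accuracy inflation seen in the live demo.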
In the practice lesson, the instructor demonstrates implementations, runs code, and works through examples using the practice notebook linked below. The weekly lab is then assigned as homework, in which students apply these techniques on their own.
| 0:00–0:10 | 10 min | Setup & recap: Recap the lecture's key ideas and open the working notebook. |
| 0:10–1:00 | 50 min | Instructor demonstrations |
| 1:00–1:05 | 5 min | Break |
| 1:05–1:45 | 40 min | Instructor demonstrations (continued) |
| 1:45–2:00 | 15 min | Wrap-up & lab brief: Summarize the patterns shown and brief the weekly lab (homework), which students complete on their own. |
Open the practice notebook in Colab · Curated references · Lab (homework)