Week 7 Part II · Training Infrastructure
Regularization & Generalization
Overfitting; dropout, weight decay, early stopping; basic data augmentation.
Learning goals
- Diagnose overfitting from the gap between training and validation performance.
- Apply dropout, weight decay, early stopping, and augmentation.
- Attribute a generalization gain to a specific cause.
This is the weekly homework lab, completed independently after the lecture and the practice lesson. It follows the course's Build / Predict & probe / Explain & defend model: use an AI assistant freely for the Build; the graded learning is in Predict and Explain. See the AI-use policy and a fully worked sample submission.
Exercise
Part A · Build (AI assistant welcome)
- First make a model clearly overfit (small data, large model), then add regularization to close the gap.
Part B · Predict & probe (student reasoning)
- Predict which regularizer will help most and how the train-minus-validation gap changes.
Part C · Explain & defend (in plain language)
- Run an ablation (dropout, weight decay, augmentation) and explain which helped and why; critique a claim about where to place dropout.
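One way to lay out the ablation is to enumerate every on/off combination of the three regularizers, so that any gain can be attributed by comparing runs that differ in exactly one switch. The sketch below is only an illustration: the names `REGULARIZERS`, `ablation_configs`, and `format_table` are hypothetical, and the gap column stays at `TODO` until you fill in your own measured results.

```python
from itertools import product

# Hypothetical switch names for this lab; rename them to match your own code.
REGULARIZERS = ("dropout", "weight_decay", "augmentation")


def ablation_configs(names=REGULARIZERS):
    """Yield one on/off dict per combination of regularizers (2**n runs)."""
    for flags in product((False, True), repeat=len(names)):
        yield dict(zip(names, flags))


def format_table(configs, gaps=None):
    """Render the ablation grid as plain text.

    `gaps` maps a run index to a measured train-minus-validation gap.
    Runs without a measurement are shown as TODO.
    """
    gaps = gaps or {}
    lines = [f"{'run':<5}{'dropout':<9}{'weight_decay':<14}{'augmentation':<14}{'gap':<6}"]
    for i, cfg in enumerate(configs):
        marks = ["on" if cfg[name] else "off" for name in REGULARIZERS]
        gap_text = f"{gaps[i]:.3f}" if i in gaps else "TODO"
        lines.append(f"{i:<5}{marks[0]:<9}{marks[1]:<14}{marks[2]:<14}{gap_text:<6}")
    return "\n".join(lines)


configs = list(ablation_configs())
table = format_table(configs)
print(table)
```

Run 0 (everything off) is the overfit baseline. Comparing it with the runs that enable a single regularizer isolates each individual effect, while the remaining runs show how the regularizers interact.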
Deliverables
- An overfit baseline plus a regularized run.
- An ablation table and an explanation.
Hints.
- Watch the train-minus-validation gap, not the validation metric alone.
- Augmentation applies to the training split only.
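The first hint can be made concrete with a small helper that tracks the gap per epoch. This is a minimal sketch, assuming the gap is measured in accuracy (for losses the sign flips, since overfitting shows up as validation loss exceeding training loss). The function names and the accuracy curves are invented for illustration and are not taken from any real run.

```python
def train_val_gap(train_acc, val_acc):
    """Return the per-epoch train-minus-validation accuracy gap."""
    if len(train_acc) != len(val_acc):
        raise ValueError("need one value per epoch for both splits")
    return [t - v for t, v in zip(train_acc, val_acc)]


def gap_is_widening(gaps, window=3):
    """True if the gap grew strictly over each of the last `window` epochs."""
    if len(gaps) < window + 1:
        return False
    recent = gaps[-(window + 1):]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


# Hypothetical accuracy curves, made up purely to exercise the helpers.
train_acc = [0.60, 0.75, 0.86, 0.93, 0.97, 0.99]
val_acc = [0.58, 0.70, 0.74, 0.75, 0.74, 0.73]

gaps = train_val_gap(train_acc, val_acc)
widening = gap_is_widening(gaps)
print([round(g, 2) for g in gaps], widening)
```

In these made-up curves the validation accuracy stays within a few points of its peak, which looks harmless in isolation. The steadily widening gap is what reveals that the model keeps fitting the training set without generalizing any better.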
Self-check
Answer each question before reading its answer. If one is unclear, revisit the lab and the references.
What signals overfitting?
A growing gap between training and validation loss: training loss keeps falling while validation loss plateaus or rises.
What does weight decay do?
Penalizes large weights with an L2 term, shrinking them toward zero and reducing variance and overfitting.
Where should dropout be applied, and when is it disabled?
After activations in hidden layers; disabled at evaluation time.
Should augmentation be applied to validation and test?
No, only to the training set.
What is early stopping?
Stopping training once the validation metric stops improving, usually after a set number of epochs without improvement.
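The early-stopping answer above can be sketched as a small patience counter. This is one common formulation rather than a prescribed implementation: the class name `EarlyStopping`, the `patience` and `min_delta` parameters, and the validation losses are all assumptions made up for the example.

```python
class EarlyStopping:
    """Signal a stop when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # epochs to wait after the last improvement
        self.min_delta = min_delta    # smallest decrease that counts as improvement
        self.best_loss = float("inf")
        self.best_epoch = None
        self.bad_epochs = 0

    def step(self, epoch, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.best_epoch = epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical validation losses: improving until epoch 3, then degrading.
val_losses = [0.90, 0.70, 0.62, 0.60, 0.61, 0.63, 0.66, 0.70]

stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses):
    if stopper.step(epoch, loss):
        stopped_at = epoch
        break

print(stopped_at, stopper.best_epoch, stopper.best_loss)
```

In a real training loop you would also save a checkpoint whenever `best_epoch` is updated, so that the weights you keep come from the best epoch and not from the epoch at which training stopped.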
Instructor lesson plan (with references)