Week 7 Part II · Training Infrastructure

Regularization & Generalization

Overfitting; dropout, weight decay, early stopping; basic data augmentation.

Learning goals

Diagnose overfitting from the gap between training and validation performance.
Apply dropout, weight decay, early stopping, and augmentation.
Attribute a generalization gain to a specific cause.

This is the weekly homework lab, completed independently after the lecture and the practice lesson. It follows the course's Build / Predict & probe / Explain & defend model: use an AI assistant freely for the Build; the graded learning is in Predict and Explain. See the AI-use policy and a fully worked sample submission.

⚙Exercise

Part A · Build (AI assistant welcome)

First make a model clearly overfit (small data, large model), then add regularization to close the gap.

Part B · Predict & probe (student reasoning)

Predict which regularizer will help most and how the train-minus-validation gap changes.

Part C · Explain & defend (in plain language)

Run an ablation (dropout, weight decay, augmentation) and explain which helped and why; critique a claim about where to place dropout.
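One way to organize the ablation is to enumerate every on/off combination of the three regularizers and record one table row per run. The sketch below is only a scaffold: the names ablation_configs and table_row are hypothetical, and the accuracies passed to table_row must come from your own training runs.

```python
from itertools import product

# The three regularizers named in the lab; each is switched on or off per run.
REGULARIZERS = ("dropout", "weight_decay", "augmentation")


def ablation_configs(names=REGULARIZERS):
    """Return one {name: bool} config for every on/off combination."""
    return [dict(zip(names, flags))
            for flags in product((False, True), repeat=len(names))]


def table_row(config, train_acc, val_acc):
    """Build one ablation-table row from a config and measured accuracies.

    Accuracies are fractions in [0, 1]; the last column is the
    train-minus-validation gap.
    """
    flags = ["on" if config[name] else "off" for name in REGULARIZERS]
    gap = train_acc - val_acc
    return flags + [f"{train_acc:.3f}", f"{val_acc:.3f}", f"{gap:+.3f}"]


if __name__ == "__main__":
    # Print the header and the list of runs to perform (no results yet).
    print(" | ".join(REGULARIZERS + ("train", "val", "gap")))
    for config in ablation_configs():
        print(config)
```

The first config (everything off) is the overfit baseline and the last (everything on) is the fully regularized run. Comparing rows that differ in exactly one switch is what lets you attribute a gain to a single cause.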

✓Deliverables

An overfit baseline plus a regularized run.
An ablation table and an explanation.

Hints.

Watch the train-minus-validation gap, not the validation metric alone.
Augmentation applies to the training split only.

❓Self-check

Answer each before expanding it. If one is unclear, revisit the lab and the references.

What signals overfitting?

Training loss keeps falling while validation loss plateaus or rises, so the gap between them grows.
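This signal can be checked directly from logged per-epoch losses. The following is a rough heuristic sketch, not a standard test: the function names and the window parameter are assumptions made for illustration. For losses the gap is taken as validation minus training, so a growing positive number points to overfitting (for accuracies the sign flips to train minus validation).

```python
def loss_gap(train_losses, val_losses):
    """Per-epoch gap for losses: validation minus training."""
    return [v - t for t, v in zip(train_losses, val_losses)]


def looks_overfit(train_losses, val_losses, window=3):
    """Heuristic: over the last `window` epochs, training loss went down
    while validation loss went up."""
    if len(train_losses) != len(val_losses):
        raise ValueError("loss histories must have the same length")
    if len(train_losses) <= window:
        return False  # not enough history to compare
    train_falling = train_losses[-1] < train_losses[-1 - window]
    val_rising = val_losses[-1] > val_losses[-1 - window]
    return train_falling and val_rising
```

A single noisy epoch can fool a check like this, so in practice it helps to plot both curves as well.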

What does weight decay do?

Shrinks weights toward zero at every update. With plain SGD this is equivalent to an L2 penalty on the weights; it lowers variance and so reduces overfitting.
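For plain SGD, the link between an L2 penalty and the name "weight decay" can be seen on a single scalar weight. This is a minimal sketch under that assumption (plain SGD, penalty written as (lam / 2) * w**2); the function names are illustrative, and adaptive optimizers such as Adam do not satisfy this equivalence.

```python
def sgd_step_l2(w, grad, lr, lam):
    """SGD step on loss + (lam / 2) * w**2.

    The penalty adds lam * w to the gradient of the data loss.
    """
    return w - lr * (grad + lam * w)


def sgd_step_decay(w, grad, lr, lam):
    """The same step rewritten: first shrink w by (1 - lr * lam),
    then take the usual gradient step."""
    return (1.0 - lr * lam) * w - lr * grad
```

With a zero data gradient the weight is simply multiplied by (1 - lr * lam) each step, which is the "decay".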

Where should dropout be applied, and when is it disabled?

After activations in hidden layers; disabled at evaluation time.
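The train/eval difference is easiest to see in a hand-written version. Below is a sketch of "inverted" dropout, the variant common in modern frameworks, applied to a plain list of activations; the function signature is an assumption for illustration and is not any library's API.

```python
import random


def dropout(activations, p=0.5, training=True, rng=random):
    """Inverted dropout on a list of floats.

    Training: zero each activation with probability p and scale the
    survivors by 1 / (1 - p) so the expected value is unchanged.
    Evaluation: return the activations untouched.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]
```

Because the scaling happens during training, nothing needs to be rescaled at evaluation time; forgetting to switch to evaluation mode leaves predictions noisy.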

Should augmentation be applied to validation and test?

No, only to the training set.
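A simple way to enforce this is to route every batch through one function that checks the split. The sketch below uses a random horizontal flip on images stored as lists of rows; random_hflip and make_batch are hypothetical helpers, and the split names are assumptions.

```python
import random


def random_hflip(image, rng, p=0.5):
    """Flip a 2D image (list of rows) left-right with probability p."""
    if rng.random() < p:
        return [row[::-1] for row in image]
    return image


def make_batch(images, split, rng):
    """Augment only when split == "train"; validation and test images
    pass through unchanged so their metrics stay comparable across runs."""
    if split == "train":
        return [random_hflip(img, rng) for img in images]
    return list(images)
```

Usage: call make_batch(images, "train", random.Random(0)) for training batches and make_batch(images, "val", rng) for evaluation.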

What is early stopping?

Stopping training once the validation metric has stopped improving for a set number of epochs (the patience), usually keeping the best checkpoint.
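A common way to implement this is a small counter with a patience setting. The class below is a minimal sketch that assumes the monitored metric is a validation loss (lower is better); the class name, patience, and min_delta are illustrative choices, not a specific library's interface.

```python
class EarlyStopping:
    """Signal a stop after `patience` epochs without improvement."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta      # smallest drop that counts as progress
        self.best = float("inf")        # best validation loss seen so far
        self.best_epoch = None          # epoch whose checkpoint to keep
        self.bad_epochs = 0             # epochs since the last improvement

    def step(self, epoch, val_loss):
        """Record this epoch's validation loss; return True to stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.best_epoch = epoch
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In a training loop, call step once per epoch after validation, save a checkpoint whenever best_epoch changes, and restore that checkpoint when step returns True.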

Instructor lesson plan (with references)