Introduction to Deep Learning · HIT

Week 7 · Part II · Training Infrastructure

Regularization & Generalization

Overfitting; dropout, weight decay, early stopping; basic data augmentation.

Learning goals

This is the weekly homework lab, completed independently after the lecture and the practice lesson. It follows the course's Build / Predict & probe / Explain & defend model: use an AI assistant freely for the Build; the graded learning is in Predict and Explain. See the AI-use policy and a fully worked sample submission.

Exercise

Part A · Build (AI assistant welcome)

  1. First make a model clearly overfit (small data, large model), then add regularization to close the gap.
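The two regularizers most often added at this step can be sketched in a few lines. This is a minimal, framework-free illustration and not the lab's required implementation; the function names `dropout` and `sgd_step_with_weight_decay` are hypothetical, and the weight-decay step assumes plain SGD with an L2 penalty of the form (weight_decay / 2) * sum(w^2).

```python
import random


def dropout(activations, p, training, rng=random):
    """Inverted dropout on a list of activations (illustrative helper).

    During training each unit is zeroed with probability p and the
    survivors are scaled by 1 / (1 - p), so the expected value is unchanged.
    At evaluation time the input is returned as-is.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


def sgd_step_with_weight_decay(weights, grads, lr, weight_decay):
    """One plain-SGD step on loss + (weight_decay / 2) * sum(w ** 2).

    The penalty contributes weight_decay * w to each gradient, which
    shrinks every weight toward zero on each step.
    """
    return [w - lr * (g + weight_decay * w) for w, g in zip(weights, grads)]
```

In a real framework these correspond to a dropout layer and an optimizer's weight-decay setting; the sketch only shows what each one does to the numbers.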

Part B · Predict & probe (student reasoning)

  1. Predict which regularizer will help most and how the train-minus-validation gap changes.
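To check a prediction against a run, the train-minus-validation gap can be computed per epoch from logged metrics. The sketch below assumes the metric is accuracy (so a positive, growing gap suggests overfitting); the helper names and the `window` parameter are illustrative choices, not part of the lab specification.

```python
def train_minus_val_gap(train_acc, val_acc):
    """Per-epoch gap: training accuracy minus validation accuracy."""
    if len(train_acc) != len(val_acc):
        raise ValueError("histories must have the same number of epochs")
    return [t - v for t, v in zip(train_acc, val_acc)]


def gap_is_widening(gaps, window=3):
    """True if the gap grew at each of the last `window` epoch-to-epoch steps."""
    if len(gaps) < window + 1:
        return False
    recent = gaps[-(window + 1):]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))
```

If the metric is a loss instead of an accuracy, the sign flips: training loss minus validation loss becomes more negative as the model overfits.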

Part C · Explain & defend (in plain language)

  1. Run an ablation (dropout, weight decay, augmentation) and explain which helped and why; critique a claim about where to place dropout.
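One way to organize the ablation is to train once per on/off combination of the three regularizers and tabulate the results. The sketch below assumes a user-supplied callable, here called `train_and_eval` (a hypothetical name), that accepts the three boolean flags as keyword arguments and returns a `(train_acc, val_acc)` pair; it does not train anything itself.

```python
from itertools import product

REGULARIZERS = ("dropout", "weight_decay", "augmentation")


def run_ablation(train_and_eval, regularizers=REGULARIZERS):
    """Run train_and_eval for every on/off combination of the regularizers.

    train_and_eval is assumed to take one boolean keyword argument per
    regularizer and return (train_acc, val_acc). Returns one row per run.
    """
    rows = []
    for flags in product((False, True), repeat=len(regularizers)):
        config = dict(zip(regularizers, flags))
        train_acc, val_acc = train_and_eval(**config)
        rows.append({
            **config,
            "train_acc": train_acc,
            "val_acc": val_acc,
            "gap": train_acc - val_acc,
        })
    return rows
```

With three regularizers this gives eight runs, including the unregularized baseline, so the effect of each regularizer can be read off both alone and in combination with the others.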

Deliverables

Hints.

Self-check

Answer each before expanding it. If one is unclear, revisit the lab and the references.

What signals overfitting?
Training loss keeps falling while validation loss plateaus or rises, so the gap between them grows.
What does weight decay do?
Adds an L2 penalty on the weights, which shrinks them toward zero and reduces variance and overfitting.
Where should dropout be applied, and when is it disabled?
After activations in hidden layers; disabled at evaluation time.
Should augmentation be applied to validation and test?
No, only to the training set.
What is early stopping?
Stopping training once the validation metric has stopped improving, typically after a set number of epochs without improvement (the patience).
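The early-stopping rule in the last answer can be written as a small patience counter. This is a sketch under stated assumptions: the monitored metric is a validation loss (lower is better), and the `patience` and `min_delta` parameters are illustrative names, not something the lab prescribes.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 0-indexed.

    Training stops at the first epoch that completes `patience` consecutive
    epochs without the loss improving on the best value by more than
    min_delta. If that never happens, the last epoch is returned.
    """
    best_loss = float("inf")
    best_epoch = -1
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best: remember it and reset the patience counter.
            best_loss, best_epoch, bad_epochs = loss, epoch, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch, best_epoch
    return len(val_losses) - 1, best_epoch
```

In practice the weights from `best_epoch` are the ones to keep, since the model at `stop_epoch` has already trained past its best validation result.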

Instructor lesson plan (with references)
