Introduction to Deep Learning · HIT

Week 3   Part I · Foundations

MLPs & Backpropagation

Multilayer perceptrons; the forward pass; backpropagation mechanics via PyTorch autograd.

Learning goals

This is the weekly homework lab, completed independently after the lecture and the practice lesson. It follows the course's Build / Predict & probe / Explain & defend model: use an AI assistant freely for the Build; the graded learning is in Predict and Explain. See the AI-use policy and a fully worked sample submission.

Exercise

Part A · Build (AI assistant welcome)

  1. Implement an MLP (nn.Module) and train it on a small classification task with autograd.
  2. Implement the backward pass for one linear layer by hand (manual tensor ops).
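As a starting point for the Build, here is a minimal sketch of item 1. The toy dataset (an XOR-like sign rule on two random features), the layer sizes, the learning rate, and the step count are all illustrative assumptions, not values prescribed by the lab.

```python
import torch
from torch import nn

torch.manual_seed(0)


class MLP(nn.Module):
    """Two-layer perceptron: Linear -> ReLU -> Linear."""

    def __init__(self, in_dim=2, hidden=16, n_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, x):
        # Returns raw logits; CrossEntropyLoss applies log-softmax itself.
        return self.net(x)


# Hypothetical toy data: class 1 when both coordinates share a sign.
X = torch.randn(256, 2)
y = (X[:, 0] * X[:, 1] > 0).long()

model = MLP()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

losses = []
for step in range(300):
    optimizer.zero_grad()          # clear gradients from the previous step
    loss = loss_fn(model(X), y)    # forward pass
    loss.backward()                # autograd fills .grad on every parameter
    optimizer.step()               # gradient descent update
    losses.append(loss.item())

print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}")
```

Full-batch gradient descent is used here only to keep the sketch short; a mini-batch loop with a DataLoader would follow the same four calls.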

Part B · Predict & probe (student reasoning)

  1. Predict the sign and rough magnitude of one weight's gradient after a single step on a toy example.
  2. Predict how training changes if the nonlinearity is removed.
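One way to practise the first prediction is on a single-weight model small enough to work by hand. The sketch below assumes a hypothetical toy setup (prediction w * x, squared-error loss, w = 0.5, x = 2, target 3, learning rate 0.1); write down your own sign and magnitude before running it.

```python
import torch

w = torch.tensor(0.5, requires_grad=True)
x, t = torch.tensor(2.0), torch.tensor(3.0)


def grad_at(w):
    """Gradient of the squared error (w*x - t)^2 with respect to w."""
    loss = (w * x - t) ** 2
    (g,) = torch.autograd.grad(loss, w)
    return g.item()


g0 = grad_at(w)        # by hand: 2 * (0.5*2 - 3) * 2 = -8
with torch.no_grad():
    w -= 0.1 * g0      # one SGD step: 0.5 - 0.1 * (-8) = 1.3
g1 = grad_at(w)        # by hand: 2 * (1.3*2 - 3) * 2 = -1.6

print(g0, g1)
```

The gradient is negative both times because the prediction is below the target and x is positive, so the update raises w. The magnitude shrinks after the step because the error shrinks.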

Part C · Explain & defend (in plain language)

  1. Verify the hand-computed gradient matches autograd within tolerance and explain any difference.
  2. Explain in words what each gradient tells its weight to do.
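The comparison in item 1 could be set up along these lines. This is a sketch under assumptions: a mean-squared-error loss, the nn.Linear convention y = x @ W.T + b, float64 tensors, and an illustrative tolerance. The shapes and the loss are not specified by the lab.

```python
import torch

torch.manual_seed(0)
N, d_in, d_out = 4, 3, 2
x = torch.randn(N, d_in, dtype=torch.float64, requires_grad=True)
W = torch.randn(d_out, d_in, dtype=torch.float64, requires_grad=True)
b = torch.randn(d_out, dtype=torch.float64, requires_grad=True)
target = torch.randn(N, d_out, dtype=torch.float64)

# Forward pass and autograd reference gradients.
y = x @ W.T + b
loss = ((y - target) ** 2).mean()
loss.backward()

# Manual backward pass for the same layer.
with torch.no_grad():
    G = 2.0 * (y - target) / y.numel()  # dL/dy for the mean squared error
    dW = G.T @ x                        # (d_out, d_in): outer products summed over the batch
    db = G.sum(dim=0)                   # (d_out,): bias sees every example
    dx = G @ W                          # (N, d_in): gradient passed to the previous layer

ok = all(
    torch.allclose(manual, leaf.grad, atol=1e-10)
    for manual, leaf in [(dW, W), (db, b), (dx, x)]
)
print(ok)
```

In float64 the two results agree almost to machine precision. In float32, small differences usually come from rounding and from a different order of summation, which is the kind of explanation item 1 asks for.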

Deliverables

Hints.

Self-check

Answer each before expanding it. If one is unclear, revisit the lab and the references.

Why does an MLP need nonlinear activations?
Without them, stacked linear layers collapse into a single linear map.
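A quick numerical check of this claim, as a sketch with arbitrary layer sizes: two stacked nn.Linear layers with no activation between them give the same outputs as one layer whose weight and bias are computed from the two. With biases present the combined map is, strictly speaking, affine.

```python
import torch
from torch import nn

torch.manual_seed(0)
f1 = nn.Linear(5, 8).double()
f2 = nn.Linear(8, 3).double()
x = torch.randn(10, 5, dtype=torch.float64)

with torch.no_grad():
    stacked = f2(f1(x))                    # two layers, no nonlinearity

    # f2(f1(x)) = (x W1^T + b1) W2^T + b2 = x (W2 W1)^T + (W2 b1 + b2)
    W = f2.weight @ f1.weight              # shape (3, 5)
    b = f2.weight @ f1.bias + f2.bias      # shape (3,)
    collapsed = x @ W.T + b                # one equivalent layer

same = torch.allclose(stacked, collapsed)
print(same)
```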
What does backpropagation compute?
Gradients of the loss with respect to each parameter, via the chain rule.
Why call optimizer.zero_grad() each step?
PyTorch accumulates gradients, so they must be cleared before each backward pass.
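The accumulation is easy to observe directly. In this sketch the gradient is cleared by hand with w.grad = None, which is an assumption made for brevity; optimizer.zero_grad() clears the gradients of every parameter the optimizer manages.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

(w ** 2).sum().backward()
first = w.grad.clone()     # d/dw sum(w^2) = 2w

(w ** 2).sum().backward()  # second backward without clearing
second = w.grad.clone()    # the new gradient was added to the old one

w.grad = None              # clear by hand
(w ** 2).sum().backward()
third = w.grad.clone()     # back to a single gradient

print(first, second, third)
```

Here second is twice first, which is the stale-gradient error that clearing before each backward pass avoids.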
What does .backward() do?
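Backpropagates from the tensor it is called on and accumulates the result into .grad of every leaf tensor with requires_grad=True that contributed to it. Intermediate (non-leaf) tensors do not keep a .grad unless retain_grad() is called on them.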
How can a computed gradient be checked for correctness?
Compare it to a finite-difference (numerical) estimate, e.g. with torch.autograd.gradcheck, using double-precision inputs.
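Both forms of the check can be sketched as follows. The function linear_mse and the helper central_diff are hypothetical names introduced for illustration, and the step size and tolerances are assumptions. Double precision is used because finite differences are too noisy in float32.

```python
import torch
from torch.autograd import gradcheck


def linear_mse(x, W, b):
    """Scalar test function: mean squared output of one linear layer."""
    return ((x @ W.T + b) ** 2).mean()


def central_diff(f, W, i, j, h=1e-6):
    """Numerical estimate of df/dW[i, j] by a central difference."""
    Wp, Wm = W.detach().clone(), W.detach().clone()
    Wp[i, j] += h
    Wm[i, j] -= h
    return ((f(Wp) - f(Wm)) / (2 * h)).item()


torch.manual_seed(0)
x = torch.randn(4, 3, dtype=torch.float64, requires_grad=True)
W = torch.randn(2, 3, dtype=torch.float64, requires_grad=True)
b = torch.randn(2, dtype=torch.float64, requires_grad=True)

# Library check: compares analytical and numerical Jacobians for all inputs.
passed = gradcheck(linear_mse, (x, W, b), eps=1e-6, atol=1e-4)

# Hand-rolled check for a single weight entry.
(gW,) = torch.autograd.grad(linear_mse(x, W, b), W)
numeric = central_diff(lambda Wv: linear_mse(x.detach(), Wv, b.detach()), W, 0, 1)
diff = abs(numeric - gW[0, 1].item())

print(passed, diff)
```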

Instructor lesson plan (with references)
