Week 10 Part III · Architectures & Representation Learning
Recurrent Networks (RNNs)
Sequence data and recurrence; the RNN cell; backpropagation through time; vanishing and exploding gradients.
Learning goals
- Build a plain RNN for sequence data.
- Understand recurrence and backpropagation through time.
- Observe the vanishing-gradient problem directly.
This is the weekly homework lab, completed independently after the lecture and the practice lesson. It follows the course's Build / Predict & probe / Explain & defend model: use an AI assistant freely for the Build; the graded learning is in Predict and Explain. See the AI-use policy and a fully worked sample submission.
⚙Exercise
Part A · Build (AI assistant welcome)
- Build a plain RNN on a character-level or short time-series task.
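As a starting point for the Build, here is a minimal sketch of the forward pass of a plain tanh RNN on a character-level input. It uses only the Python standard library so that it runs anywhere; a real notebook would more likely use NumPy or PyTorch. The names `rnn_forward`, `one_hot`, `W_xh`, `W_hh`, and `b_h`, the toy string, and the weight ranges are illustrative assumptions, not part of the course materials.

```python
import math
import random

def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def one_hot(index, size):
    """Encode an integer index as a one-hot vector of the given size."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def rnn_forward(xs, W_xh, W_hh, b_h, h0):
    """Run a tanh RNN over the sequence xs and return every hidden state.

    The same W_xh, W_hh and b_h are reused at every time step.
    """
    h = h0
    hs = []
    for x in xs:
        pre = [p + q + r for p, q, r in zip(matvec(W_xh, x), matvec(W_hh, h), b_h)]
        h = [math.tanh(v) for v in pre]  # new hidden state, fed to the next step
        hs.append(h)
    return hs

random.seed(0)
text = "hello"                      # toy sequence (assumption)
vocab = sorted(set(text))
V, H = len(vocab), 3                # vocabulary size, hidden size
W_xh = [[random.uniform(-0.5, 0.5) for _ in range(V)] for _ in range(H)]
W_hh = [[random.uniform(-0.5, 0.5) for _ in range(H)] for _ in range(H)]
b_h = [0.0] * H

xs = [one_hot(vocab.index(c), V) for c in text]
hidden_states = rnn_forward(xs, W_xh, W_hh, b_h, [0.0] * H)
for c, h in zip(text, hidden_states):
    print(c, [round(v, 3) for v in h])
```

The sketch stops at the hidden states; an output layer and a training loop would still need to be added for the lab.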
Part B · Predict & probe (student reasoning)
- Before running any measurement, predict how the gradient magnitude changes from the last time step back to the earliest ones on long sequences.
Part C · Explain & defend (in plain language)
- Measure gradient norms across time steps, demonstrate vanishing gradients on long sequences, and explain why long-range dependencies are hard for a plain RNN.
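One way to take the measurement is to run backpropagation through time by hand and record the norm of the loss gradient with respect to each hidden state. The sketch below does this for a small tanh RNN using only the standard library. The loss (the sum of the final hidden state), the weight range of ±0.2, the sequence length of 50, and the helper names are all assumptions chosen for illustration; with an autograd framework the same quantity would be read from the framework's own gradients.

```python
import math
import random

def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matvec_T(M, v):
    """Product of the transpose of M with v."""
    return [sum(M[i][j] * v[i] for i in range(len(M))) for j in range(len(M[0]))]

def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))

def hidden_grad_norms(xs, W_xh, W_hh, h0):
    """Return [||dL/dh_1||, ..., ||dL/dh_T||] for L = sum of the entries of h_T."""
    # Forward pass: keep every hidden state, since the backward pass needs them.
    h, hs = h0, []
    for x in xs:
        pre = [p + q for p, q in zip(matvec(W_xh, x), matvec(W_hh, h))]
        h = [math.tanh(v) for v in pre]
        hs.append(h)
    # Backward pass: start from dL/dh_T = 1 and walk back one step at a time.
    grad = [1.0] * len(h0)
    norms = [l2_norm(grad)]
    for k in range(len(xs) - 1, 0, -1):
        # hs[k] is the state produced at step k+1; tanh'(a) = 1 - tanh(a)^2.
        d_pre = [g * (1.0 - hv * hv) for g, hv in zip(grad, hs[k])]
        grad = matvec_T(W_hh, d_pre)  # gradient w.r.t. the previous hidden state
        norms.append(l2_norm(grad))
    norms.reverse()                   # index 0 now corresponds to the first step
    return norms

random.seed(0)
T, D, H = 50, 2, 3                    # sequence length, input size, hidden size
W_xh = [[random.uniform(-0.2, 0.2) for _ in range(D)] for _ in range(H)]
W_hh = [[random.uniform(-0.2, 0.2) for _ in range(H)] for _ in range(H)]
xs = [[random.uniform(-1.0, 1.0) for _ in range(D)] for _ in range(T)]

norms = hidden_grad_norms(xs, W_xh, W_hh, [0.0] * H)
for t in (1, 10, 25, 50):
    print(f"t={t:2d}  ||dL/dh_t|| = {norms[t - 1]:.3e}")
```

Plotting `norms` against the time step on a logarithmic y-axis gives the kind of plot the deliverables ask for; the explanation of its shape remains the graded part.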
✓Deliverables
- An RNN notebook.
- A gradient-norm-versus-time-step plot with an explanation.
Hints.
- Clip gradients to avoid explosion; start with short sequences.
- Log the gradient norm at the earliest time steps to see vanishing.
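The clipping hint can be sketched as follows, assuming clipping by global L2 norm (one common variant; clipping each value independently is another). The function name `clip_by_global_norm` and the example numbers are hypothetical. If the notebook uses PyTorch, `torch.nn.utils.clip_grad_norm_` performs this kind of rescaling in place on a model's parameters.

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Rescale a list of gradient vectors so their joint L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping,
    which is worth logging to see how often clipping actually triggers.
    """
    total = math.sqrt(sum(g * g for vec in grads for g in vec))
    if total <= max_norm:
        return grads, total           # already small enough: leave untouched
    scale = max_norm / total
    return [[g * scale for g in vec] for vec in grads], total

grads = [[3.0, 4.0], [12.0]]          # toy gradients with joint norm 13
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
norm_after = math.sqrt(sum(g * g for vec in clipped for g in vec))
print(norm_before, norm_after)
```

Rescaling every gradient by the same factor keeps the update direction and only shortens the step. Note that clipping addresses exploding gradients only; it does nothing to restore gradients that have already vanished.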
❓Self-check
Answer each before expanding it. If one is unclear, revisit the lab and the references.
What does an RNN share across time steps?
The same weights (parameters).
What is backpropagation through time?
Ordinary backpropagation applied to the network unrolled across time steps, with the gradients of the shared weights summed over all steps.
Why do long sequences cause vanishing gradients?
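Backpropagation through time multiplies the gradient by the recurrent Jacobian once per time step. When those factors have norm below 1, the product shrinks exponentially with the number of steps; when they have norm above 1, it can explode instead.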
Name one remedy for exploding gradients.
Gradient clipping.
What is the hidden state?
The recurrent memory passed from one time step to the next.
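The answers on the hidden state and on vanishing gradients can be written out for a tanh RNN. The derivation below is a sketch under two assumptions that the lab does not state: the cell is $h_t = \tanh(\cdot)$ with weights named $W_{hh}$, $W_{xh}$, $b_h$, and the loss $L$ depends only on the final state $h_T$. The square $h_k^{2}$ is taken elementwise.

```latex
\begin{align}
h_t &= \tanh\!\left(W_{hh}\, h_{t-1} + W_{xh}\, x_t + b_h\right), \\
\frac{\partial h_k}{\partial h_{k-1}} &= \operatorname{diag}\!\left(1 - h_k^{2}\right) W_{hh}, \\
\frac{\partial L}{\partial h_t} &= \left(\frac{\partial h_{t+1}}{\partial h_t}\right)^{\!\top} \cdots \left(\frac{\partial h_T}{\partial h_{T-1}}\right)^{\!\top} \frac{\partial L}{\partial h_T}, \\
\left\lVert \frac{\partial L}{\partial h_t} \right\rVert_2 &\le \lVert W_{hh} \rVert_2^{\,T-t}\, \left\lVert \frac{\partial L}{\partial h_T} \right\rVert_2 .
\end{align}
```

The bound uses $0 < 1 - h_k^{2} \le 1$, so each Jacobian has spectral norm at most $\lVert W_{hh} \rVert_2$. If $\lVert W_{hh} \rVert_2 < 1$, the gradient reaching step $t$ decays at least exponentially in $T - t$. Explosion requires $\lVert W_{hh} \rVert_2 > 1$, which is necessary but not sufficient, because saturated tanh units also shrink the Jacobian.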
Instructor lesson plan (with references)