Introduction to Deep Learning · HIT

Week 10   Part III · Architectures & Representation Learning

Recurrent Networks (RNNs)

Sequence data and recurrence; the RNN cell; backpropagation through time; vanishing and exploding gradients.

Learning goals

This is the weekly homework lab, completed independently after the lecture and the practice lesson. It follows the course's Build / Predict & probe / Explain & defend model: use an AI assistant freely for the Build; the graded learning is in Predict and Explain. See the AI-use policy and a fully worked sample submission.

Exercise

Part A · Build (AI assistant welcome)

  1. Build a plain RNN on a character-level or short time-series task.
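As a starting point for the Build, here is a minimal sketch of the recurrence in a plain (Elman-style) RNN, written in pure Python so every operation is visible. The names `rnn_step`, `rnn_forward`, `init_matrix`, and the sizes and weight scale are illustrative assumptions, not part of the assignment. In practice you would most likely build the model with a deep learning framework.

```python
import math
import random

def rnn_step(x, h_prev, W_xh, W_hh, b_h):
    """One step of a plain RNN: h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h)."""
    h = []
    for i in range(len(b_h)):
        pre = b_h[i]
        pre += sum(W_xh[i][j] * x[j] for j in range(len(x)))
        pre += sum(W_hh[i][j] * h_prev[j] for j in range(len(h_prev)))
        h.append(math.tanh(pre))
    return h

def rnn_forward(xs, h0, W_xh, W_hh, b_h):
    """Apply the same cell (same weights) at every time step; return all hidden states."""
    hs, h = [], h0
    for x in xs:
        h = rnn_step(x, h, W_xh, W_hh, b_h)
        hs.append(h)
    return hs

def init_matrix(rows, cols, scale, rng):
    """Small uniform random weights (an arbitrary choice for this sketch)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
input_size, hidden_size, seq_len = 3, 4, 5  # illustrative sizes
W_xh = init_matrix(hidden_size, input_size, 0.5, rng)
W_hh = init_matrix(hidden_size, hidden_size, 0.5, rng)
b_h = [0.0] * hidden_size

# A random input sequence standing in for encoded characters or time-series values.
xs = [[rng.uniform(-1.0, 1.0) for _ in range(input_size)] for _ in range(seq_len)]
hs = rnn_forward(xs, [0.0] * hidden_size, W_xh, W_hh, b_h)
print(len(hs), len(hs[0]))  # number of time steps, hidden size
```

Note that `W_xh`, `W_hh`, and `b_h` are created once and reused at every step. The hidden state `h` is the only quantity that carries information forward in time.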

Part B · Predict & probe (student reasoning)

  1. Predict how the gradient magnitude changes across time steps for long sequences.

Part C · Explain & defend (in plain language)

  1. Measure gradient norms across time steps, demonstrate vanishing gradients on long sequences, and explain why long-range dependencies are hard for a plain RNN.
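One possible way to instrument the measurement is sketched below for a scalar RNN, h_t = tanh(w·h_{t-1} + u·x_t), where the chain rule gives d h_T / d h_{t-1} as a running product of the factors w·(1 − h_t²). The function name `scalar_rnn_grad_magnitudes` and the values of `w`, `u`, and `T` are assumptions chosen for illustration. For your own model you would record the corresponding gradient norms from your framework's backward pass instead.

```python
import math
import random

def scalar_rnn_grad_magnitudes(xs, w, u, h0=0.0):
    """Return [|d h_T / d h_t| for t = 0..T] for h_t = tanh(w*h_{t-1} + u*x_t)."""
    # Forward pass: store every hidden state, hs[0] is the initial state.
    hs = [h0]
    for x in xs:
        hs.append(math.tanh(w * hs[-1] + u * x))

    T = len(xs)
    grads = [0.0] * (T + 1)
    grads[T] = 1.0  # d h_T / d h_T
    g = 1.0
    # Backward pass: walk from the last step to the first, multiplying
    # by the one-step derivative d h_t / d h_{t-1} = w * (1 - h_t^2).
    for t in range(T, 0, -1):
        g *= w * (1.0 - hs[t] ** 2)
        grads[t - 1] = abs(g)
    return grads

rng = random.Random(0)
T = 50
xs = [rng.uniform(-1.0, 1.0) for _ in range(T)]
grads = scalar_rnn_grad_magnitudes(xs, w=0.9, u=0.5)

# Print a few magnitudes, from the last step back to the first.
for t in (T, T - 10, T - 25, 0):
    print(f"t={t:2d}  |d h_T / d h_t| = {grads[t]:.3e}")
```

Try rerunning with a different `w` or a longer sequence and compare the result with your Part B prediction before you write your explanation.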

Deliverables

Hints.

Self-check

Answer each before expanding it. If one is unclear, revisit the lab and the references.

What does an RNN share across time steps?
The same weights (parameters).
What is backpropagation through time?
Backpropagation on the network unrolled across time steps.
Why do long sequences cause vanishing gradients?
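Backpropagation through time multiplies the gradient by one recurrent Jacobian per time step. When those factors are consistently smaller than 1 in norm, the gradient shrinks exponentially with sequence length (vanishing); when they are larger than 1, it grows exponentially (exploding).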
Name one remedy for exploding gradients.
Gradient clipping.
What is the hidden state?
The recurrent memory passed from one time step to the next.
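The gradient clipping answer above can be illustrated with a small sketch of clipping by global norm: if the overall gradient norm exceeds a threshold, every component is rescaled by the same factor so the direction is preserved. The function name `clip_by_global_norm` and the threshold value are assumptions for illustration. Deep learning frameworks typically provide their own clipping utilities.

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradient values so its L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads), total_norm  # already small enough, leave unchanged
    scale = max_norm / total_norm
    return [g * scale for g in grads], total_norm

# Example: a gradient with L2 norm 5 clipped to norm 1.
clipped, norm_before = clip_by_global_norm([3.0, 4.0], max_norm=1.0)
norm_after = math.sqrt(sum(g * g for g in clipped))
print(norm_before, norm_after)
```

Clipping limits the size of an update when gradients explode. It does not help with vanishing gradients, because it never scales a small gradient up.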

Instructor lesson plan (with references)
