Week 11 · Part III · Architectures & Representation Learning

LSTMs, GRUs & Sequence Tasks

Gated recurrent cells (LSTM and GRU); how gates restore gradient flow; LSTM versus GRU; sequence classification and sequence-to-sequence tasks.

Learning goals

Build an LSTM or GRU and compare it to the plain RNN.
Understand how gates restore gradient flow.
Apply gated networks to a sequence task.

This is the weekly homework lab, completed independently after the lecture and the practice lesson. It follows the course's Build / Predict & probe / Explain & defend model: use an AI assistant freely for the Build; the graded learning is in Predict and Explain. See the AI-use policy and a fully worked sample submission.

⚙ Exercise

Part A · Build (AI assistant welcome)

Build an LSTM or GRU on the same task as Week 10 and compare it to the plain RNN.

Part B · Predict & probe (student reasoning)

Predict how the model will behave on long versus short sequences, and which gates will matter most.

Part C · Explain & defend (in plain language)

Ablate the gates, explain how gating preserves the gradient signal where the RNN failed, and compare LSTM with GRU.

✓ Deliverables

An LSTM or GRU notebook with an RNN-versus-gated comparison.
A gate ablation with an explanation.

Hints.

Keep the task identical to Week 10 for a fair comparison.
A GRU has fewer parameters than an LSTM of the same hidden size; check whether its long-sequence accuracy holds up.

❓ Self-check

Answer each before expanding it. If one is unclear, revisit the lab and the references.

What do the gates in an LSTM control?

How much of the cell state to forget (forget gate), how much new information to add to it (input gate), and how much of it to expose as the hidden state (output gate).
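As a rough illustration of those three roles, the sketch below implements one step of a single-unit LSTM in plain Python. The function name `lstm_step`, the parameter layout, and the `pinned` argument are hypothetical choices for this example, not part of the lab's starter code. `pinned` holds a gate at a constant value, which is one simple way to probe what that gate contributes.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, params, pinned=None):
    """One step of a single-unit LSTM (scalars only, for readability).

    params maps "forget", "input", "output", "candidate" to (w_x, w_h, b).
    pinned (hypothetical helper) maps a gate name to a fixed value.
    """
    pinned = pinned or {}

    def pre(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    gates = {name: pinned.get(name, sigmoid(pre(name)))
             for name in ("forget", "input", "output")}
    candidate = math.tanh(pre("candidate"))
    c = gates["forget"] * c_prev + gates["input"] * candidate  # forget, then add
    h = gates["output"] * math.tanh(c)                         # expose as output
    return h, c, gates

# Hand-picked illustrative parameters (not learned values).
params = {
    "forget": (0.0, 0.0, 10.0),   # large bias: gate near 1, keep the old cell state
    "input": (0.0, 0.0, -10.0),   # gate near 0, add almost nothing new
    "output": (0.0, 0.0, 10.0),   # gate near 1, expose the cell state
    "candidate": (1.0, 0.0, 0.0),
}
h, c, gates = lstm_step(1.0, 0.0, 0.7, params)  # c stays close to 0.7
_, c_ablated, _ = lstm_step(1.0, 0.0, 0.7, params, pinned={"forget": 0.0})
```

With the forget gate pinned to 0, the previous cell state is discarded and `c_ablated` depends only on the (here almost closed) input gate.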

How does a GRU differ from an LSTM?

It merges the cell state and hidden state into a single state and uses two gates (update and reset) instead of three, giving a simpler, lighter unit.
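One way to make "lighter" concrete is to count parameters. The sketch below assumes each gate or candidate transform is one weight block over the concatenated input and previous hidden state, with a single bias vector. That is a simplification: some libraries keep two bias vectors per block, so their reported totals will differ. The helper name and the sizes are illustrative only.

```python
def recurrent_params(input_size, hidden_size, n_blocks):
    """Weights and biases for n_blocks gate-style transforms.

    Each block maps [x_t, h_{t-1}] to hidden_size units with one bias vector
    (an assumption; some libraries keep two bias vectors per block).
    """
    per_block = hidden_size * (input_size + hidden_size) + hidden_size
    return n_blocks * per_block

input_size, hidden_size = 10, 20  # illustrative sizes, not from the lab
rnn = recurrent_params(input_size, hidden_size, 1)   # one tanh transform
gru = recurrent_params(input_size, hidden_size, 3)   # update, reset, candidate
lstm = recurrent_params(input_size, hidden_size, 4)  # forget, input, output, candidate
print(rnn, gru, lstm)  # 620 1860 2480
```

Under these assumptions a GRU carries three quarters of the parameters of an LSTM with the same input and hidden sizes.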

How does gating help with vanishing gradients?

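The cell state is updated additively, which gives a near-linear path through time. The gradient along that path is scaled by the forget gate at each step rather than repeatedly multiplied by the recurrent weights and squashed by the tanh derivative, so the signal survives as long as the forget gate stays near 1.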
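A scalar toy calculation can make the contrast visible. The sketch below compares the gradient of the final state with respect to the initial state for a one-unit tanh RNN against the gradient along the direct cell-state path of an LSTM. It assumes a constant forget gate and ignores the indirect dependence of the gates on the hidden state, so it illustrates the mechanism rather than a full backpropagation-through-time computation. The function names and the values 0.9 and 0.99 are arbitrary choices for the example.

```python
import math

def rnn_state_gradient(w, h0, steps):
    """d h_T / d h_0 for the scalar RNN h_t = tanh(w * h_{t-1}) with no input."""
    h, grad = h0, 1.0
    for _ in range(steps):
        h = math.tanh(w * h)
        grad *= w * (1.0 - h * h)  # chain rule: recurrent weight times tanh'
    return grad

def cell_path_gradient(forget_gate, steps):
    """d c_T / d c_0 along the direct cell-state path, with a constant forget gate."""
    return forget_gate ** steps

steps = 50
rnn_grad = rnn_state_gradient(w=0.9, h0=0.5, steps=steps)
lstm_grad = cell_path_gradient(forget_gate=0.99, steps=steps)
print(f"RNN: {rnn_grad:.2e}   LSTM cell path: {lstm_grad:.2e}")
```

After 50 steps the RNN gradient has shrunk by more than two orders of magnitude, while the cell-path gradient is still above one half.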

What is teacher forcing?

Feeding the ground-truth previous token, rather than the model's own previous prediction, as the decoder input during training.
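The difference is easiest to see in the decoding loop itself. The sketch below is a minimal, framework-free illustration: `run_decoder` and `copy_last` are hypothetical names, the "decoder" is a deliberately poor toy function, and the hidden state a real decoder would carry between steps is omitted.

```python
def run_decoder(step_fn, start_token, targets, teacher_forcing):
    """Run a decoder step function over len(targets) positions.

    step_fn(prev_token) -> predicted token; a stand-in for a real decoder step.
    Returns the inputs the decoder saw and the tokens it predicted.
    """
    inputs, predictions = [], []
    prev = start_token
    for target in targets:
        inputs.append(prev)
        pred = step_fn(prev)
        predictions.append(pred)
        # Teacher forcing feeds the ground truth; free running feeds the prediction.
        prev = target if teacher_forcing else pred
    return inputs, predictions

def copy_last(token):
    """Toy, deliberately poor 'decoder' that just repeats its input token."""
    return token

targets = [3, 1, 4]
forced_inputs, _ = run_decoder(copy_last, 0, targets, teacher_forcing=True)
free_inputs, _ = run_decoder(copy_last, 0, targets, teacher_forcing=False)
print(forced_inputs)  # [0, 3, 1]
print(free_inputs)    # [0, 0, 0]
```

With teacher forcing the decoder always sees the correct history, so one early mistake does not contaminate later inputs. In free-running mode the toy model's errors feed back into every subsequent step.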

How does sequence classification differ from seq2seq?

Sequence classification produces one label for the whole input sequence; seq2seq produces an output sequence, typically with an encoder-decoder.

Instructor lesson plan (with references)