Week 11 Part III · Architectures & Representation Learning

LSTMs, GRUs & Sequence Tasks

Gated recurrent units; how gates restore gradient flow; LSTM versus GRU; sequence classification and sequence-to-sequence tasks.

Curated, free, canonical references for this week: a course or lecture, a book chapter, a video, and an authoritative blog post or official tutorial.

Course

Dive into Deep Learning, Chapter 10: Modern Recurrent Neural Networks (d2l.ai)

Full sections on LSTM, GRU, and sequence-to-sequence machine translation.

Book

Dive into Deep Learning, 10.7 Sequence-to-Sequence Learning (d2l.ai)

Builds an encoder-decoder seq2seq model with teacher forcing and masked loss.

Video

StatQuest: Long Short-Term Memory (LSTM), Clearly Explained (youtube.com)

Step-by-step visual explanation of LSTM gates and the cell state.

Blog / Docs

Christopher Olah: Understanding LSTM Networks (colah.github.io)

A widely cited illustrated walkthrough of LSTM gates, with a section on LSTM variants that covers the GRU.
